Gene cluster involved in safracin biosynthesis and its uses for genetic engineering

ABSTRACT

A gene cluster is disclosed having open reading frames which encode polypeptides sufficient to direct the synthesis of a safracin molecule. In addition, the present disclosure is directed to a nucleic acid sequence, suitably an isolated nucleic acid sequence, which includes or comprises at least SEQ ID NO:1, variants or portions thereof, or at least one of the sacA, sacB, sacC, sacD, sacE, sacF, sacG, sacH, sacI, sacJ, orf1, orf2, orf3 or orf4 genes, including variants or portions.

FIELD OF THE INVENTION

The present invention relates to the gene cluster responsible for the biosynthesis of safracin, its uses for genetic engineering and new safracins obtained by manipulation of the biosynthesis mechanism.

BACKGROUND OF THE INVENTION

Safracins, a family of new compounds with a potent broad-spectrum antibacterial activity, were discovered in a culture broth of Pseudomonas sp. Safracin occurs in two Pseudomonas sp. strains: Pseudomonas fluorescens A2-2, isolated from a soil sample collected in Tagawagun, Fukuoka, Japan (Ikeda et al. J. Antibiotics 1983, 36, 1279-1283; WO 82 00146 and JP 58113192), and Pseudomonas fluorescens SC 12695, isolated from water samples taken from the Raritan-Delaware Canal, near New Jersey (Meyers et al. J. Antibiot. 1983, 36(2), 190-193). Safracins A and B, produced by Pseudomonas fluorescens A2-2, have been examined against different tumor cell lines and have been found to possess antitumor activity in addition to antibacterial activity.

Due to the structural similarities between safracin B and ET-743, safracin offers the possibility of hemi-synthesis of ET-743, a highly promising and potent new antitumor agent which was isolated from the marine tunicate Ecteinascidia turbinata and which is currently in Phase II clinical trials in Europe and the United States. A hemi-synthesis of ET-743 has been achieved starting from safracin B (Cuevas et al. Organic Lett. 2000, 10, 2545-2548; WO 00 69862 and WO 01 87895).

As an alternative to making safracins or their structural analogs by chemical synthesis, manipulating the genes governing secondary metabolism offers a promising route and allows these compounds to be prepared biosynthetically. Additionally, the safracin structure offers exciting possibilities for combinatorial biosynthesis.

In view of the complex structure of the safracins and the limitations in obtaining them from Pseudomonas fluorescens A2-2, it would be highly desirable to understand the genetic basis of their synthesis in order to create the means to influence it in a targeted manner. This could increase the amounts of safracins being produced, because natural production strains generally yield only low concentrations of the secondary metabolites that are of interest. It could also allow the production of safracins in hosts that otherwise do not produce these compounds. Additionally, the genetic manipulation could be used for combinatorial creation of novel safracin analogs that could exhibit improved properties and that could be used in the hemi-synthesis of new ecteinascidin compounds.

However, the success of a biosynthetic approach depends critically on the availability of novel genetic systems and on genes encoding novel enzyme activities. Elucidation of the safracin gene cluster contributes to the general field of combinatorial biosynthesis by expanding the repertoire of genes uniquely associated with safracin biosynthesis, leading to the possibility of making novel precursors and safracins via combinatorial biosynthesis.

SUMMARY OF THE INVENTION

We have now been able to identify and clone the genes of safracin biosynthesis, providing the genetic basis for improving and manipulating in a targeted manner the productivity of Pseudomonas sp. and, using genetic methods, for synthesising safracin analogues. Additionally, these genes encode enzymes that are involved in biosynthetic processes to produce structures, such as safracin precursors, that can form the basis of combinatorial chemistry to produce a wide variety of compounds. These compounds can be screened for a variety of bioactivities including anticancer activity.

Therefore, in a first aspect the present invention provides a nucleic acid, suitably an isolated nucleic acid, which includes a DNA sequence (including mutations or variants thereof) that encodes non-ribosomal peptide synthetases which are responsible for the biosynthesis of safracins. This invention provides a gene cluster, suitably an isolated gene cluster, with open reading frames encoding polypeptides to direct the assembly of a safracin molecule.

One aspect of the present invention is a composition including at least one nucleic acid sequence, suitably an isolated nucleic acid molecule, that encodes at least one polypeptide that catalyses at least one step of the biosynthesis of safracins. Two or more such nucleic acid sequences can be present in the composition. DNA or corresponding RNA is also provided.

In particular the present invention is directed to a nucleic acid sequence, suitably an isolated nucleic acid sequence, from a safracin gene cluster comprising said nucleic acid sequence, a portion or portions of said nucleic acid sequence wherein said portion or portions encode a polypeptide or polypeptides or a biologically active fragment of a polypeptide or polypeptides, a single-stranded nucleic acid sequence derived from said nucleic acid sequence, or a single stranded nucleic acid sequence derived from a portion or portions of said nucleic acid sequence, or a double-stranded nucleic acid sequence derived from the single-stranded nucleic acid sequence (such as cDNA from mRNA). The nucleic acid sequence can be DNA or RNA.

More particularly, the present invention is directed to a nucleic acid sequence, suitably an isolated nucleic acid sequence, which includes or comprises at least SEQ ID 1, variants or portions thereof, or at least one of the sacA, sacB, sacC, sacD, sacE, sacF, sacG, sacH, sacI, sacJ, orf1, orf2, orf3 or orf4 genes, including variants or portions. Portions can be at least 10, 15, 20, 25, 50, 100, 1000, 2500, 5000, 10000, 20000, 25000 or more nucleotides in length. Typically the portions are in the range 100 to 5000, or 100 to 2500 nucleotides in length, and are biologically functional.

Mutants or variants include polynucleotide molecules in which at least one nucleotide residue is altered, substituted, deleted or inserted. Multiple changes are possible, with a different nucleotide at 1, 2, 3, 4, 5, 10, 15, 25, 50, 100, 200, 500 or more positions. Degenerate variants are envisaged which encode the same polypeptide, as well as non-degenerate variants which encode a different polypeptide. The portion, mutant or variant nucleic acid sequence suitably encodes a polypeptide which retains a biological activity of the respective polypeptide encoded by any of the open reading frames of the safracin gene cluster. Allelic forms and polymorphisms are embraced.

The invention is also directed to an isolated nucleic acid sequence capable of hybridizing under stringent conditions with a nucleic acid sequence of this invention. Particularly preferred is hybridisation with a translatable length of a nucleic acid sequence of this invention.

The invention is also directed to a nucleic acid encoding a polypeptide which is at least 30%, preferably 50%, preferably 60%, more preferably 70%, in particular 80%, 90%, 95% or more identical in amino acid sequence to a polypeptide encoded by any of the safracin gene cluster open reading frames sacA to sacJ and orf1 to orf4 (SEQ ID 1 and genes encoded in SEQ ID 1) or encoded by a variant or portion thereof. The polypeptide suitably retains a biological activity of the respective polypeptide encoded by any of the safracin gene cluster open reading frames.
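By way of illustration only, the identity thresholds recited above can be checked with a short calculation once two amino acid sequences have been aligned. The sketch below is not taken from the specification: the function names, the gap character "-" and the choice to ignore gapped columns are assumptions, and other definitions of percent identity exist. The two toy sequences are hypothetical and are not safracin proteins.

```python
# Identity levels recited in the text, in ascending order.
THRESHOLDS = (30, 50, 60, 70, 80, 90, 95)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: the inputs are already aligned, have equal length, and use
    '-' for gaps. Gapped columns are skipped (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def highest_threshold_met(identity: float):
    """Return the highest recited threshold reached, or None if below 30%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical toy alignment: 6 matches over 7 compared columns.
identity = percent_identity("MKT-LLVA", "MKTALIVA")
print(round(identity, 1), highest_threshold_met(identity))
```

In practice the alignment itself would come from a dedicated alignment tool; only the counting step is sketched here.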

In particular, the invention is directed to an isolated nucleic acid sequence encoding for any of SacA, SacB, SacC, SacD, SacE, SacF, SacG, SacH, SacI, SacJ, Orf1, Orf2, Orf3 or Orf4 proteins (SEQ ID 2-15), and variants, mutants or portions thereof.

In one aspect, an isolated nucleic acid sequence of this invention encodes a peptide synthetase, an L-Tyr derivative hydroxylase, an L-Tyr derivative methylase, an L-Tyr O-methylase, a methyl-transferase, a monooxygenase or a safracin resistance protein.

The invention also provides a hybridization probe which is a nucleic acid sequence as defined above or a portion thereof. Probes suitably comprise a sequence of at least 5, 10, 15, 20, 25, 30, 40, 50, 60, or more nucleotide residues. Sequences with a length in the range 25 to 60 are preferred. The invention is also directed to the use of a probe as defined for the detection of a safracin or ecteinascidin gene. In particular, the probe is used for the detection of genes in Ecteinascidia turbinata.

In a related aspect the invention is directed to a polypeptide encoded by a nucleic acid sequence as defined above. Full sequence, variant, mutant or fragment polypeptides are envisaged.

In a further aspect the invention is directed to a vector, preferably an expression vector, preferably a cosmid, comprising a nucleic acid sequence encoding a protein or biologically active fragment of a protein, wherein said nucleic acid is as defined above.

In another aspect the invention is directed to a host cell transformed with one or more of the nucleic acid sequences as defined above, or a vector, an expression vector or cosmid as defined above. A preferred host cell is transformed with an exogenous nucleic acid comprising a gene cluster encoding polypeptides sufficient to direct the assembly of a safracin or safracin analog. Preferably the host cell is a microorganism, more preferably a bacterium.

The invention is also directed to a recombinant bacterial host cell in which at least a portion of a nucleic acid sequence as defined above is disrupted to result in a recombinant host cell that produces altered levels of safracin compound or safracin analogue, relative to a corresponding nonrecombinant bacterial host cell.

The invention is also directed to a method of producing a safracin compound or safracin analogue comprising fermenting, under conditions and in a medium suitable for producing such a compound or analogue, an organism such as Pseudomonas sp, in which the copy number of the safracin genes/cluster encoding polypeptides sufficient to direct the assembly of a safracin or safracin analog has been increased.

The invention is also directed to a method of producing a safracin compound or analogue comprising fermenting, under conditions and in a medium suitable for producing such compound or analogue, an organism such as Pseudomonas sp in which expression of the genes encoding polypeptides sufficient to direct the assembly of a safracin or safracin analogue has been modulated by manipulation or replacement of one or more genes or sequence responsible for regulating such expression. Preferably expression of the genes is enhanced.

The invention is also directed to the use of a composition including at least one isolated nucleic acid sequence as defined above or a modification thereof for the combinatorial biosynthesis of non-ribosomal peptides, diketopiperazine rings and safracins.

In particular the method involves contacting a compound that is a substrate for a polypeptide encoded by one or more of the safracin biosynthesis gene cluster open reading frames as defined above with the polypeptide encoded by one or more safracin biosynthesis gene cluster open reading frames, whereby the polypeptide chemically modifies the compound.

In still another embodiment, this invention provides a method of producing a safracin or safracin analog. The method involves providing a microorganism transformed with an exogenous nucleic acid comprising a safracin gene cluster encoding polypeptides sufficient to direct the assembly of said safracin or safracin analog; culturing the microorganism under conditions permitting the biosynthesis of safracin or safracin analog; and isolating said safracin or safracin analog from said cell.

The invention is also directed to any of the precursor compounds P2, P14, analogs and derivatives thereof and their use in the combinatorial biosynthesis of non-ribosomal peptides, diketopiperazine rings and safracins.

Additionally, the invention is also directed to the new safracins obtained by gene knockout, namely safracin P19B, safracin P22A, safracin P22B, safracin D and safracin E, and their use as antimicrobial or antitumor agents, as well as their use in the synthesis of ecteinascidin compounds.

The invention is also directed to new safracins obtained by directed biosynthesis as defined above, and their use as antimicrobial or antitumor agents, as well as their use in the synthesis of ecteinascidin compounds. In particular the invention is directed to safracin B-ethoxy and safracin A-ethoxy and their use.

In one aspect, the present invention enables the preparation of structures related to safracins and ecteinascidins which cannot be prepared, or are difficult to prepare, by chemical synthesis. Another aspect is to use the knowledge to gain access to the biosynthesis of ecteinascidins in Ecteinascidia turbinata, for example using these sequences or parts thereof as probes in this organism or a putative symbiont.

More fundamentally, the invention opens a broad field and gives access to ecteinascidins by genetic engineering.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1: Structural organization of the chromosomal DNA region cloned in the pL30P cosmid. The region of P. fluorescens A2-2 DNA containing the safracin gene cluster is shown. Both gene operons, sacABCDEFGH and sacIJ, and the modular organization of the peptide synthetases deduced from sacA, sacB and sacC are illustrated. The following domains are indicated: C: condensation; T: thiolation; A: adenylation and Re: reductase. The location of the other genes present in the pL30P cosmid (orf1 to orf4), as well as their proposed function, is shown.

FIG. 2: Conserved core motifs between NRPSs. Conserved amino acid sequences in SacA (Residues 484-953 of SEQ ID NO: 2), SacB (Residues 525-999 of SEQ ID NO: 3) and SacC (Residues 516-992 of SEQ ID NO: 4) proteins and comparison with their homologous sequences from Myxococcus xanthus DM50415 (Core sequences disclosed as SEQ ID NOS: 26-30; SafB1, SafB2, SafA1 and SafA2 disclosed as SEQ ID NOS: 31-34, respectively, in order of appearance).

FIG. 3: NRPS biosynthesis mechanism proposed for the formation of the Ala-Gly dipeptide. Step a*, adenylation of Ala; b*, transfer to the 4′-phosphopantetheinyl arm; c*, transfer to the waiting/elongation site; d*, adenylation of the Gly; e*, transfer to the 4′-phosphopantetheinyl arm; f*, condensation of the elongation chain on the 4′-phosphopantetheinyl arm with the starter chain at the waiting/elongation site; g*, Ala-Gly dipeptide attached to the phosphopantetheinyl arm of SacA and h*, transfer of the elongated chain to the following waiting/elongation site.

FIG. 4: Cross-feeding experiments. A. Scheme of A2-2 DNA fragments cloned in pBBR1-MCS2 vector and products obtained in the heterologous host. B. HPLC profile of safracin production in wild type strain versus sacF mutant. The addition of P2 precursor to the sacF mutant, provided both in trans and synthetically, yields safracin B production. SfcA, safracin A and SfcB, safracin B.

FIG. 5: Scheme of the safracin biosynthesis mechanism and biosynthetic intermediates. Single enzymatic steps are indicated by a continuous arrow and multiple reactions steps are indicated by discontinuous arrows.

FIG. 6: Safracin gene disruptions and compounds produced. A. Gene disruption and precursor molecules synthesized by the mutants constructed. The gene marked with an asterisk does not belong to the safracin cluster. Inactivation of genes orf1, orf2, orf3 and orf4 has been shown to have no effect on safracin production. B. HPLC profile of safracin production in wild type strain and in sacA, sacI and sacJ mutants. The structure of the different molecules obtained is shown.

FIG. 7: Structure of the different molecules obtained by gene disruption. Inactivation of SacJ protein (a) yields P22B, P22A and P19 molecules, whereas gene disruption of sacI (b), produces only P19 compound. The sacI disruption, together with the sacJ reconstructed expression, produces two new safracins: safracin D (possible precursor for ET-729 hemi-synthesis) and safracin E (c).

FIG. 8: Addition of specific designed “unnatural” precursors (P3). Chemical structure of the two molecules obtained by addition of P3 compound to the sacF mutant.

FIG. 9: Scheme of the gene disruption event through simple recombination, using a homologous DNA fragment cloned into pK18:MOB (an integrative plasmid in Pseudomonas).

DETAILED DESCRIPTION OF THE INVENTION

Non-ribosomal peptide synthetases (NRPS) are enzymes responsible for the biosynthesis of a family of compounds that include a large number of structurally and functionally diverse natural products. For example, such peptides provide the structural backbone for compounds that exhibit a variety of biological activities, such as antibiotic, antiviral, antitumor and immunosuppressive agents (Zuber et al. Biotechnology of Antibiotics 1997 (W. Strohl, ed.), 187-216 Marcel Dekker, Inc., N.Y; Marahiel et al. Chem. Rev. 1997, 97, 2651-2673). Although structurally diverse, most of these biologically active peptides share a common mechanistic scheme of biosynthesis. According to this model, peptide bond formation takes place on multienzymes designated peptide synthetases, on which amino acid substrates are activated by ATP hydrolysis to the corresponding adenylate. This unstable intermediate is subsequently transferred to another site of the multienzyme, where it is bound as a thioester to the cysteamine group of an enzyme-bound 4′-phosphopantetheinyl (4′-PP) cofactor. At this stage, the thiol-activated substrates can undergo modifications such as epimerisation or N-methylation. Thioesterified substrate amino acids are then integrated into the peptide product through a step-by-step elongation by a series of transpeptidation reactions. With this template arrangement in peptide synthetases, the modules seem to operate independently of one another, but they act in concert to catalyse the formation of successive peptide bonds (Stachelhaus et al. Science 1995, 269, 69-72; Stachelhaus et al. Chem. Biol. 1996, 3, 913-921). The general scheme for non-ribosomal peptide biosynthesis has been widely reviewed (Marahiel et al. Chem. Rev. 1997, 97, 2651-2673; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48; Moffit and Neilan, FEMS Microbiol. Letters 2000, 191, 159-167).

A large number of bacterial operons and fungal genes encoding peptide synthetases have recently been cloned, sequenced and partially characterized, providing valuable insights into their molecular architecture (Marahiel, Chem and Biol. 1997, 4, 561-567). Different cloning strategies were used, including probing of expression libraries by antibodies raised against peptide synthetases, complementation of deficient mutants, and the use of designed oligonucleotides derived from amino acid sequences of peptide synthetase fragments.

Analysis of the primary structure of these genes revealed the presence of distinct homologous domains of about 600 amino acids. These specific functional domains consist of at least six highly conserved core sequences of about three to eight amino acids in length, whose order and location within all known domains are very similar (Küsard and Marahiel, Peptide Research 1994, 7, 238-241). The use of degenerate oligonucleotides derived from the conserved cores opens the possibility of identifying and cloning peptide synthetases from genomic DNA by using the polymerase chain reaction (PCR) technology (Küsard and Marahiel, Peptide Research 1994, 7, 238-241; Borchert et al. FEMS Microbiol Letters 1992, 92, 175-180).
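To illustrate why short conserved cores lend themselves to degenerate oligonucleotide design, the following sketch counts how many distinct coding sequences can encode a given peptide under the standard genetic code. It is an illustrative calculation only, not the primer design procedure used in the cited work. The function name is hypothetical, and the sample peptides (TGD and LKAGA, two of the core sequences mentioned elsewhere in this description) are used purely as example input.

```python
# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def coding_degeneracy(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode `peptide` exactly.

    Assumption: only the 20 standard residues are accepted; placeholders
    such as 'x' are rejected rather than guessed at.
    """
    total = 1
    for residue in peptide.upper():
        if residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"unsupported residue: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total


for core in ("TGD", "LKAGA"):
    print(core, coding_degeneracy(core))
```

A lower count means fewer oligonucleotide species are needed to cover every possible coding sequence, which is one reason residues such as Met and Trp are attractive in primer regions. An oligonucleotide written with nucleotide ambiguity codes can cover more sequences than this exact count.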

The structure of safracin suggests that this compound is synthesized by an NRPS mechanism. The cloning and expression of the non-ribosomal peptide synthetases and the associated tailoring enzymes from the Pseudomonas fluorescens A2-2 safracin cluster would allow production of unlimited amounts of safracin. In addition, the cloned genes could be used for combinatorial creation of novel safracin analogs that could exhibit improved properties and that could be used in the hemi-synthesis of new ecteinascidins. Moreover, cloning and expressing the safracin gene cluster in heterologous systems or the combination of safracin gene cluster with other NRPS genes could result in the creation of novel drugs with improved activities.

The present invention provides, in particular, the DNA sequence encoding the NRPS responsible for biosynthesis of safracin, i.e., safracin synthetases. We have characterized a 26,705 bp region (SEQ ID NO:1) from the Pseudomonas fluorescens A2-2 genome, cloned in the pL30P cosmid, and demonstrated, by knockout experiments and heterologous expression, that this region is responsible for safracin biosynthesis. We expressed the pL30P cosmid in two strains of Pseudomonas sp. which do not produce safracin, and the result was production of safracin A and B at levels of 22% for P. fluorescens (CECT 378) and 2% for P. aeruginosa (CECT 110), in comparison with P. fluorescens A2-2 production. The predicted amino acid sequences of the various peptides encoded by this DNA sequence are shown in SEQ ID NO:2 through SEQ ID NO:15, respectively.

The gene cluster for safracin biosynthesis derived from P. fluorescens A2-2 is characterized by the presence of several open reading frames (ORF) that are organized in two divergent operons (FIG. 1), an eight-gene operon (sacABCDEFGH) and a two-gene operon (sacIJ), preceded by well-conserved putative promoter regions that overlap. The safracin biosynthesis gene cluster is present in only one copy in the P. fluorescens A2-2 genome.

Our results indicate that the eight-gene operon would be responsible for the safracin skeleton biosynthesis and the two-gene operon would be responsible for the final tailoring of safracins.

In the sacABCDEFGH operon, the deduced amino acid sequences encoded by sacA, sacB and sacC strongly resemble gene products of NRPSs. Within the deduced amino acid sequences of SacA, SacB and SacC, one peptide synthetase module was identified on each of the ORFs.

The first surprising feature of the safracin NRPS proteins is that, of the known active sites and core regions of peptide synthetases (Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48), the first core is poorly conserved in all three peptide synthetases, SacA, SacB and SacC (FIG. 2). The other five core regions are well conserved in the three safracin NRPS genes. The biological significance of the first core (LKAGA; SEQ ID NO: 16) is unknown, but the SGT(ST)TGxPKG (SEQ ID NO: 17) (Gocht and Marahiel, J. Bacteriol. 1994, 176, 2654-2662; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48), the TGD (Gocht and Marahiel, J. Bacteriol. 1994, 176, 2654-2662; Konz and Marahiel, 1999) and the KIRGxRIEL (SEQ ID NO: 18) (Pavela-Vrancic et al. J. Biol. Chem. 1994, 269, 14962-14966; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48) core sequences could be assigned to ATP binding and hydrolysis. The serine residue of the core sequence LGGxS (SEQ ID NO: 19) could be shown to be the site of thioester formation (D'Souza et al., J. Bacteriol. 1993, 175, 3502-3510; Vollenbroich et al., FEBS Lett. 1993, 325(3), 220-4; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48) and 4′-phosphopantetheine binding (Stein et al. FEBS Lett. 1994, 340, 39-44; Konz and Marahiel, Chem. and Biol. 1999, 6, R39-R48). These findings, together with the fact that safracin seems to be synthesized from amino acids, support the hypothesis that non-ribosomal peptide bond formation via the thiotemplate mechanism is involved in the biosynthetic pathway of safracin and that sacA, sacB and sacC encode the corresponding peptide synthetases. According to this mechanism, amino acids are activated as aminoacyl adenylates by ATP hydrolysis and subsequently covalently bound to the enzyme via carboxyl-thioester linkages. Then, in further steps, transpeptidation and peptide bond formation occur.
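As a rough illustration of how core sequences of this kind can be located in a deduced protein sequence, the sketch below converts the motif notation used above into a regular expression and reports 1-based match positions. Two reading conventions are assumed here and are not stated in the specification: a parenthesised group such as (ST) is taken to mean "either residue at this position", and a lower-case x is taken to mean "any residue". The function names and the toy protein are hypothetical, and the toy protein is not a Sac sequence.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Translate core-motif notation into a regular expression.

    Assumed conventions: '(ST)' means S or T; 'x' means any residue.
    """
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":
            close = motif.index(")", i)
            parts.append("[" + motif[i + 1:close] + "]")
            i = close + 1
        elif ch == "x":
            parts.append("[A-Z]")
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def find_core(motif: str, protein: str):
    """Return (1-based start position, matched text) for each exact motif hit."""
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical toy protein containing two of the cores named in the text.
toy_protein = "MAAASGTTTGKPKGAAALGGDSAA"
for core in ("SGT(ST)TGxPKG", "LGGxS", "KIRGxRIEL"):
    print(core, find_core(core, toy_protein))
```

Exact pattern matching of this kind is a crude screen. A poorly conserved core, such as the first core discussed above, would be missed, so alignment-based comparison remains the more appropriate tool for the analysis shown in FIG. 2.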

Secondly, it is striking that our sequence data clearly show that the colinearity rule, according to which the order of the amino acid binding modules along the chromosome parallels the order of the amino acids in the peptide, does not hold for the safracin synthetase system. According to the sequence database homologies and the structural homologies between safracin and saframycin, SacA would be responsible for the recognition and activation of the Gly residue and SacB and SacC would be responsible for the recognition and activation of the two L-Tyr derivatives that are incorporated into the safracin skeleton, while the putative Ala-NRPS gene would be missing from the safracin gene cluster. In a few nonribosomal peptide synthetase gene clusters, such as in the pristinamycin (Crecy-Lagard et al, J. of Bacteriol. 1997, 179(3), 705-713) and in the phosphinothricin tripeptide (Schwartz et al. Appl Environ Microbiol 1996, 62, 570-577) biosynthesis pathways, the first NRPS gene is not juxtaposed with the second NRPS gene. Specifically, in the pristinamycin biosynthetic pathway the first structural gene (snbA) and the second structural gene (snbC) are 130 kb apart. This is not the case for the safracin gene cluster, where the results of the heterologous expression with the pL30P cosmid clearly demonstrate that there is no NRPS gene missing, since there is heterologous safracin production.

Thirdly, even though the question about the mechanism by which the dipeptide Ala-Gly is formed remains open, the presence in sacA of an extra C domain at the amino terminus of the first NRPS gene suggests the possibility of a bifunctional adenylation activity by this gene. We propose that the Ala would first be charged on the phosphopantetheinyl arm of SacA (FIG. 3, a* and b*) before being transferred to a waiting position, a condensation domain, located at the N-terminus of SacA (FIG. 3, c*). The Gly adenylate would then be charged on the same phosphopantetheinyl arm (FIG. 3, d* and e*), positioned to the elongation site, and elongation would occur (FIG. 3, f*). The arm of the first module would at this stage be charged with an Ala-Gly dipeptide (FIG. 3, g*). We propose that the dipeptide would then be transferred to a waiting position in the second phosphopantetheinyl arm (FIG. 3, h*), located in SacB, to continue the synthesis of the safracin tetrapeptide basic skeleton. An alternative biosynthesis mechanism could be the direct incorporation of a dipeptide Ala-Gly into SacA. In this case, the dipeptide could originate from the activity of a highly active peptidyl transferase ribozyme family (Sun et al, Chem. and Biol. 2002, 9, 619-626) or from the activity of bacterial proteolysis.
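The order of events in the proposed mechanism (steps a* to h* of FIG. 3) can be followed with a simple bookkeeping model. The sketch below is a toy representation only: the class and function names are hypothetical, adenylation and tethering are collapsed into a single "load" step, and no chemistry or kinetics is modelled. It merely tracks what is assumed to sit on the phosphopantetheinyl arm and at the waiting/elongation site after each step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Module:
    """Toy NRPS module: one 4'-phosphopantetheinyl arm and one waiting site."""
    name: str
    arm: Optional[str] = None
    waiting: Optional[str] = None


def load(module: Module, residue: str) -> None:
    """Steps a*/b* (Ala) or d*/e* (Gly): adenylate and tether to the arm."""
    if module.arm is not None:
        raise RuntimeError(f"{module.name}: arm already occupied")
    module.arm = residue


def park(module: Module) -> None:
    """Step c*: move the arm-bound residue to the waiting/elongation site."""
    if module.arm is None or module.waiting is not None:
        raise RuntimeError(f"{module.name}: cannot park")
    module.waiting, module.arm = module.arm, None


def condense(module: Module) -> None:
    """Steps f*/g*: join the waiting starter to the arm-bound residue."""
    if module.arm is None or module.waiting is None:
        raise RuntimeError(f"{module.name}: nothing to condense")
    module.arm = f"{module.waiting}-{module.arm}"
    module.waiting = None


def hand_over(source: Module, target: Module) -> None:
    """Step h*: pass the elongated chain to the next waiting site."""
    if source.arm is None or target.waiting is not None:
        raise RuntimeError("cannot hand over")
    target.waiting, source.arm = source.arm, None


def run_proposed_sequence():
    """Replay the proposed order of steps and return the resulting states."""
    sac_a, sac_b = Module("SacA"), Module("SacB")
    load(sac_a, "Ala")
    park(sac_a)
    load(sac_a, "Gly")
    condense(sac_a)
    dipeptide_on_arm = sac_a.arm
    hand_over(sac_a, sac_b)
    return dipeptide_on_arm, sac_a, sac_b


print(run_proposed_sequence())
```

The model makes the key feature of the proposal explicit: a single arm is loaded twice, which is only possible if the first residue is parked at a separate site in between.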

Fourthly, although in most of the prokaryotic peptide synthetases the thioesterase moiety, which appears to be responsible for the release of the mature peptide chain from the enzyme, is fused to the C-terminal end of the last amino acid binding module (Marahiel et al. Chem. Rev. 1997, 97, 2651-2673), in the case of safracin synthetases the TE domain is missing. Probably, in safracin synthesis, after the last elongation step the tetrapeptide is released by an alternative strategy for peptide-chain termination that also occurs in saframycin synthesis (Pospiech et al. Microbiol. 1996, 142, 741-746). This particular termination strategy is catalysed by a reductase domain at the carboxy-terminal end of the SacC peptide synthetase, which catalyses the reductive cleavage of the associated T-domain-tethered acyl group, releasing a linear aldehyde.

Our cross-feeding experiments indicate that the last two amino acids incorporated into the safracin molecule are two L-Tyr derivatives called P2 (3-hydroxy-5-methyl-O-methyltyrosine) (FIGS. 4, 5), instead of two L-Tyr as is proposed to occur in saframycin synthesis. First, the products of two genes (sacF and sacG), similar to bacterial methyltransferases, have been shown to be involved in the O- and C-methylation of L-Tyr to produce P14 (3-methyl-O-methyltyrosine), the precursor of P2. A possible mechanism could envisage that the O-methylation occurs first and that the C-methylation of the amino acid derivative then follows. Secondly, P2, the substrate for the peptide synthetases SacB and SacC, is formed by the hydroxylation of P14 by SacD (FIGS. 4, 5).
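The logic of the cross-feeding experiments can be summarised in a small pathway model. The sketch below is illustrative only and rests on stated assumptions: the step order (O-methylation before C-methylation) is the "possible mechanism" mentioned above rather than an established fact, the assignment of the C-methylation step to SacF is inferred from the proposed functions, and the name of the intermediate between L-Tyr and P14 is a placeholder. Function and variable names are hypothetical.

```python
# Ordered steps as (gene, substrate, product).
# Assumptions: step order follows the "possible mechanism" in the text;
# "O-methyltyrosine" is a placeholder name for the unnamed intermediate.
P2_PATHWAY = [
    ("sacG", "L-Tyr", "O-methyltyrosine"),  # O-methylation
    ("sacF", "O-methyltyrosine", "P14"),    # C-methylation (assumed assignment)
    ("sacD", "P14", "P2"),                  # hydroxylation
]


def reachable_compounds(disrupted=(), fed=()):
    """Compounds the strain can form, given disrupted genes and fed precursors."""
    disrupted = set(disrupted)
    available = {"L-Tyr", *fed}
    # A single pass suffices because the steps are listed in pathway order.
    for gene, substrate, product in P2_PATHWAY:
        if gene not in disrupted and substrate in available:
            available.add(product)
    return available


def makes_p2(disrupted=(), fed=()) -> bool:
    """True if P2, the substrate of SacB and SacC, is available."""
    return "P2" in reachable_compounds(disrupted, fed)


print(makes_p2())                          # wild type
print(makes_p2(disrupted=["sacF"]))        # sacF mutant
print(makes_p2(disrupted=["sacF"], fed=["P2"]))  # sacF mutant fed with P2
```

In this simplified picture, a mutant blocked upstream of P2 regains the substrate for SacB and SacC only when P2 itself, or an intermediate downstream of the block, is supplied, which mirrors the rescue of the sacF mutant by added P2 described for FIG. 4.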

Apart from the safracin biosynthetic genes, the sacABCDEFGH operon also contains two further genes: sacE, whose function is unknown, and sacH, which is involved in the safracin resistance mechanism. We have demonstrated that the sacH gene codes for a protein that, when heterologously expressed in different Pseudomonas strains, strongly increases resistance to safracin B. SacH is a putative transmembrane protein that transforms the C₂₁-OH group of safracin B into a C₂₁-H group to produce safracin A, a compound with less antibiotic and antitumoral activity. Finally, even though the function of SacE is still unknown, homologues of this gene have been found close to various secondary metabolite biosynthetic gene clusters in the genomes of some microorganisms, suggesting a conserved function of these genes in secondary metabolite formation or regulation.

In the sacIJ operon, the deduced amino acid sequences encoded by sacI and sacJ strongly resemble gene products of methyltransferase and hydroxylase/monooxygenase, respectively. Our data reveal that SacI is the enzyme responsible for the N-methylation present in the safracin structure, and that SacJ is the protein which makes an additional hydroxylation on one of the L-Tyr derivatives incorporated into the tetrapeptide to produce the quinone structure present in all safracin molecules. N-Methylation is one of the modifications of nonribosomally synthesized peptides that significantly contributes to their biological activity. Except for saframycin (Pospiech et al. Microbiol. 1996, 142, 741-746), which is produced by bacteria and is N-methylated, all the N-methylated nonribosomal peptides known are produced by fungi or actinomycetes and, in most cases, the N-methylation is carried out by a domain which resides in the nonribosomal peptide synthetase.

TABLE I. Summary of safracin biosynthetic and resistance genes identified in cosmid pL30P.

ORF name  Protein name  Proposed function             Position start-stop (bp)  Amino acids  Molecular weight (kDa)
sacA      SacA          Peptide synthetase            3052-6063                 1004         110.4
sacB      SacB          Peptide synthetase            6068-9268                 1063         117.5
sacC      SacC          Peptide synthetase            9275-13570                1432         157.3
sacD      SacD          L-Tyr derivative hydroxylase  13602-14651               350          39.2
sacE      SacE          Unknown                       14719-14901               61           6.7
sacF      SacF          L-Tyr derivative methylase    14962-16026               355          39.8
sacG      SacG          L-Tyr O-methylase             16115-17155               347          38.3
sacH      SacH          Resistance protein            17244-17783               180          19.6
sacI      SacI          methyl-transferase            2513-1854                 220          24.2
sacJ      SacJ          monooxygenase                 1861-355                  509          55.3
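The coordinates in Table I can be turned into strand and length information with a few lines of arithmetic. The sketch below is illustrative only. It assumes that positions are 1-based and inclusive, and that a start position greater than the stop position denotes a gene on the opposite strand, which is consistent with the divergent sacIJ operon described above but is not stated explicitly in the table. The function name is hypothetical, and the sketch makes no attempt to reconcile the coordinates with the tabulated amino acid counts.

```python
def orf_summary(start: int, stop: int) -> dict:
    """Derive strand, length and codon count from inclusive 1-based coordinates.

    Assumption: start > stop means the ORF lies on the reverse strand.
    """
    length_bp = abs(stop - start) + 1
    codons, remainder = divmod(length_bp, 3)
    return {
        "strand": "+" if stop > start else "-",
        "length_bp": length_bp,
        "codons": codons,
        "in_frame": remainder == 0,
    }


# Coordinates copied from Table I.
for name, (start, stop) in {
    "sacA": (3052, 6063),
    "sacH": (17244, 17783),
    "sacI": (2513, 1854),
}.items():
    print(name, orf_summary(start, stop))
```

Applied to sacA and sacI, the calculation places the two genes on opposite strands, in line with the two divergent operons shown in FIG. 1.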

The putative safracin biosynthetic pathway, with indications of the specific amino acid substrates used for each condensation reaction and the various post-condensation activities, is shown in FIG. 5.

To further evaluate the role of the safracin biosynthetic genes, we constructed knockout mutants of each of the genes of the safracin cluster (FIG. 6). The disruption of the NRPS genes (sacA, sacB and sacC), as well as of sacD, sacF and sacG, resulted in mutants that produce neither safracin nor P2. Our results indicate that the genes from sacA to sacH are part of the same genetic operon. As a consequence of the sacI and sacJ gene disruptions, three new molecules arose: P19B, P22A and P22B (FIG. 6).

The production of P22A and P22B (FIG. 7 a*) by the sacJ mutant demonstrated that the role of the SacJ protein is to carry out the additional hydroxylation of the left L-Tyr derivative amino acid of safracin, the one involved in the quinone ring. The production of P19B (FIG. 7 b*) by the sacI mutant, a safracin-like molecule in which the N-methylation and the quinone ring are missing, confirms that SacI is the N-methyltransferase enzyme and suggests that sacIJ is a transcriptional operon. The production of P19B by the sacJ mutant as well (FIG. 7 a*) suggests that the N-methylation probably occurs after the quinone ring has been formed. Even though these new structures show no notable antimicrobial activity against B. subtilis and no high cytotoxic activity against cancer cells, they can serve as new precursors for the hemisynthesis of new active molecules. As far as structure-activity relationships are concerned, the observation that P19B, P22A and P22B appear to lose their activity suggests that the loss of the quinone ring from the safracin structure is directly related to the loss of activity of the safracin family molecules.

The disruption of the sacI gene with reconstitution of sacJ gene expression resulted in the production of two new safracins. The two antibiotics, produced at levels as high as those of safracin A/safracin B in the wild type strain, have been named safracin D and safracin E (FIG. 7 c*).

Safracin D and safracin E are safracin B-like and safracin A-like molecules, respectively, in which the N-methylation is missing. Both safracin D and safracin E have been shown to possess the same antibacterial and antitumoral activities as safracin B and safracin A, respectively. Apart from its high antibacterial and antitumoral activities, safracin D could be used in the hemi-synthesis of the ecteinascidin ET-729, a potent antitumoral agent, as well as in the hemi-synthesis of new ecteinascidins.

A question arises concerning the role of the aminopeptidase-like protein encoded by a gene located at the 3′ end of the safracin operon. The insertional inactivation of orf1 (PM-S1-014) showed no effect on safracin A/safracin B production. Given its predicted function, it remains unclear whether this protein plays some role in safracin metabolism. The other genes present in the pL30P cosmid (orf2 to orf4) will have to be studied in more detail.

Another aspect of the invention is that it provides the tools necessary for the production of new, specifically designed “unnatural” molecules. The addition of a specific modified P2 derivative precursor named P3, a 3-hydroxy-5-methyl-O-ethyltyrosine, to the sacF mutant yields two “unnatural” safracins that incorporate this modified precursor, safracin A(OEt) and safracin B(OEt) (FIG. 8).

The two new safracins are potent antibiotic and antitumoral compounds. The biological activities of safracin A(OEt) and safracin B(OEt) are as potent as those of safracin A and safracin B, respectively. These new safracins could be a source of new potent antitumoral agents, as well as a source of molecules for the hemi-synthesis of new ecteinascidins.

In addition, the genes involved in safracin synthesis could be combined with other nonribosomal peptide synthetase genes to create novel “unnatural” drugs and analogs with improved activities.

EXAMPLES

Example 1 Extraction of Nucleic Acid Molecules from Pseudomonas fluorescens A2-2

Bacterial Strains

Strains of Pseudomonas sp. were grown at 27° C. in Luria-Bertani (LB) broth (Ausubel et al. 1995, J. Wiley and Sons, New York, N.Y). E. coli strains were grown at 37° C. in LB medium. Antibiotics were used at the following concentrations: ampicillin (50 μg/ml), tetracycline (20 μg/ml) and kanamycin (50 μg/ml).

TABLE II
Strains used in this invention.

Code        Genotype
PM-S1-001   P. fluorescens A2-2 wild type
PM-S1-002   sacA-
PM-S1-003   sacB-
PM-S1-004   sacC-
PM-S1-005   sacJ-
PM-S1-006   sacI-
PM-S1-007   sacI- with sacJ expression reconstitution
PM-S1-008   sacF-
PM-S1-009   sacG-
PM-S1-010   sacD-
PM-S1-014   orf1-
PM-S1-015   A2-2 + pLAFR3
PM-S1-016   A2-2 + pL30p
PM-19-001   P. fluorescens CECT378 + pLAFR3
PM-19-002   P. fluorescens CECT378 + pL30p
PM-19-003   P. fluorescens CECT378 + pBBR1-MCS2
PM-19-004   P. fluorescens CECT378 + pB5H83
PM-19-005   P. fluorescens CECT378 + pB7983
PM-19-006   P. fluorescens CECT378 + pBHPT3
PM-16-001   P. aeruginosa CECT110 + pLAFR3
PM-16-002   P. aeruginosa CECT110 + pL30p
PM-17-003   P. putida ATCC12633 + pBBR1-MCS2
PM-17-004   P. putida ATCC12633 + pB5H83
PM-17-005   P. putida ATCC12633 + pB7983
PM-18-003   P. stutzeri ATCC17588 + pBBR1-MCS2
PM-18-004   P. stutzeri ATCC17588 + pB5H83
PM-18-005   P. stutzeri ATCC17588 + pB7983

DNA Manipulation

Unless otherwise noted, standard molecular biology techniques for in vitro DNA manipulations and cloning were used (Sambrook et al. 1989, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory).

DNA Extraction

Total DNA from Pseudomonas fluorescens A2-2 cultures was prepared as reported (Sambrook et al. 1989, Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory).

Computer Analysis

Sequence data were compiled and analysed using DNA-Star software package.

Example 2 Identification of NRPS Genes Responsible for Safracin Production in Pseudomonas fluorescens A2-2

Primer Design

Marahiel et al. (Marahiel et al. Chem. Rev. 1997, 97, 2651-2673) previously reported highly conserved core motifs of the catalytic domains of cyclic and branched peptide synthetases. Based on multiple sequence alignments of several reported peptide synthetases, the conserved regions A2, A3, A4, A6, A7 and A8 of adenylation modules and T of thiolation modules were targeted for the degenerate primer design (Turgay and Marahiel, Peptide Res. 1994, 7, 238-241). The wobble positions were designed with respect to codon preferences within the selected modules and the expected high G/C content of Pseudomonas sp. All oligonucleotides were obtained from ISOGEN (Bioscience BV). A PCR fragment was obtained when degenerate oligonucleotides derived from the YGPTE (SEQ ID NO: 35) (A5 core) and LGGXS (SEQ ID NO: 19) (T core) sequences were used. These oligonucleotides were denoted PS34-YG and PS6-FF, respectively.

TABLE III
PCR primers designed for this study (SEQ ID NOS: 20 and 21, respectively, in order of appearance).

Primer designation and orientation  Sequence                    Length
PS34-YG (forward)                   5′-TAYGGNCCNACNGA-3′        14-mer
PS6-FF (reverse)                    5′-TSNCCNCCNADNTCRAARAA-3′  20-mer

PCR Conditions for Amplification of DNA from P. fluorescens A2-2

A fragment internal to nonribosomal peptide synthetases (NRPS) was amplified using the PS34-YG and PS6-FF oligonucleotides with P. fluorescens A2-2 chromosomal DNA as template. Reaction buffer and Taq polymerase from Promega were used. The cycling profile, performed in a Personal thermocycler (Eppendorf), consisted of 30 cycles of 1 min at 95° C., 1 min at 50° C. and 2 min at 72° C. PCR products were of the expected size (approx. 750 bp), based on the location of the primers within the NRPS domains of other synthetase genes.
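As an illustration of the degenerate primer design, the following minimal Python sketch expands the IUPAC ambiguity codes in the two primers of Table III and counts how many distinct oligonucleotides each degenerate primer represents. It is not part of the original work: the function names `degeneracy` and `expand` are hypothetical, and the only inputs taken from the text are the two primer sequences.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count

def expand(primer: str):
    """Yield every non-degenerate sequence covered by the primer."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)

# Primer sequences as listed in Table III (5' to 3').
PS34_YG = "TAYGGNCCNACNGA"
PS6_FF = "TSNCCNCCNADNTCRAARAA"

for name, seq in (("PS34-YG", PS34_YG), ("PS6-FF", PS6_FF)):
    print(f"{name}: {len(seq)}-mer, degeneracy {degeneracy(seq)}")
```

Under the standard IUPAC code, PS34-YG covers 128 sequences and PS6-FF covers 6144, which gives a rough idea of how dilute any single exact-match oligonucleotide is in each primer pool.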

DNA Cloning

PCR amplification fragments were cloned into the pGEM-T Easy vector according to the manufacturer's instructions (Qiagen, Inc., Valencia, Calif.). In this way, cloned fragments are flanked by two EcoRI restriction sites, which facilitates subsequent subcloning into other plasmids (see below). Since NRPS enzymes are modular, clones obtained with the degenerate PCR primers represent a pool of fragments from different domains.

DNA Sequencing

All sequencing was performed using primers directed against the cloning vector, with an ABI Automated sequencer (Perkin-Elmer). Cloned DNA sequences were identified using the BLAST server of the National Center for Biotechnology Information accessed over the Internet (Altschul et al., Nucleic Acids Res. 1997, 25, 3389-3521). All of the sequences have signature regions for NRPSs and show high similarity in BLAST searches to bacterial NRPS showing that they are in fact of peptide origin. Moreover, a probable domain similarity search was performed using the PROSITE (European Molecular Biology Laboratory, Heidelberg, Germany) web server.

Gene Disruption of Pseudomonas fluorescens A2-2

In order to analyse the function of the genes cloned, these genes were disrupted through homologous recombination (FIG. 9). For this purpose, recombinant plasmids (pG-PS derivatives) harbouring the NRPS gene fragment were digested with EcoRI restriction enzyme. The resulting fragments belonging to the gene to be mutated were cloned into the pK18mob mobilizable plasmid (Schäfer et al. Gene 1994, 145, 69-73), a chromosomal integrative plasmid able to replicate in E. coli but not in Pseudomonas strains. Recombinant plasmids were introduced first in E. coli S17-λPIR strain by transformation and then in P. fluorescens A2-2 through biparental conjugation (Herrero et al, J Bacteriol 1990, 172, 6557-6567). Different dilutions of the conjugation were plated onto LB solid medium containing ampicillin plus kanamycin and incubated overnight at 27° C. Kanamycin-resistant transconjugants, containing plasmids integrated into the genome via homologous recombination, were selected.
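The outcome of such an integration can be sketched as follows. This is a simplified model offered only for illustration: it assumes a single recombination event within the cloned internal fragment, and the function names and the example coordinates (a fragment spanning positions 1200 to 1949 of a 3012 bp gene) are hypothetical, not values from this work.

```python
def single_crossover(gene_length: int, frag_start: int, frag_end: int):
    """Model integration of a non-replicating plasmid carrying an internal
    gene fragment (positions frag_start..frag_end, 1-based, inclusive).

    After a single crossover the chromosome carries two incomplete copies
    separated by the vector backbone:
      - an upstream copy running from position 1 to frag_end (3' truncated)
      - a downstream copy running from frag_start to the gene end (5' truncated)
    """
    if not (1 < frag_start <= frag_end < gene_length):
        raise ValueError("fragment must be strictly internal to the gene")
    upstream_copy = (1, frag_end)
    downstream_copy = (frag_start, gene_length)
    return upstream_copy, downstream_copy

def is_intact(copy, gene_length: int) -> bool:
    """A copy is intact only if it spans the whole gene."""
    return copy == (1, gene_length)

# Hypothetical example: ~750 bp internal fragment of a 3012 bp gene.
GENE_LENGTH = 3012
up, down = single_crossover(GENE_LENGTH, 1200, 1949)
print("upstream copy:", up, "intact:", is_intact(up, GENE_LENGTH))
print("downstream copy:", down, "intact:", is_intact(down, GENE_LENGTH))
```

Because the cloned fragment lacks both ends of the gene, neither chromosomal copy is full length after integration, which is why an internal fragment (rather than a fragment containing the start or the end of the gene) is used for disruption.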

Biological Assay (biotest) for Safracin Production

Strains P. fluorescens A2-2 and its derivatives were incubated in 50 ml baffled Erlenmeyer flasks containing fermentation medium with the corresponding antibiotics. Initially, SA3 fermentation medium was used (Ikeda Y. J. Ferment. Technol. 1985, 63, 283-286). To increase the productivity of the fermentation process, statistical methods such as Plackett-Burman designs were used to select nutrients, and response surface optimisation techniques were tested (Hendrix C. Chemtech 1980, 10, 488-497) to determine the optimum level of each key independent variable. Experiments to improve culture conditions such as incubation temperature and agitation were also carried out. Finally, a medium giving high safracin B production, named 16B (152 g/l of mannitol, 35 g/l of G20-25 yeast, 26 g/l of CaCO₃, 14 g/l of ammonium sulphate, 0.18 g/l of ferric chloride, pH 6.5), was selected.

Safracin production was assayed by testing the capacity of 10 μl of the supernatant of a 3-day Pseudomonas sp. culture incubated at 27° C. to inhibit a Bacillus subtilis solid culture (Alijah et al. Appl Microbiol Biotechnol 1991, 34, 749-755). P. fluorescens A2-2 cultures produced inhibition zones of 10-14 mm diameter, whereas non-producing mutants did not inhibit B. subtilis growth. Three isolated clones had the safracin biosynthetic pathway affected. To confirm the results, HPLC analysis of safracin production was performed.

HPLC Analysis of Safracin Production

The supernatant was analysed by HPLC on a Symmetry C-18 column (300 Å, 5 μm, 250×4.6 mm; Waters) with a guard column (Symmetry C-18, 5 μm, 3.9×20 mm; Waters). The mobile phase was an ammonium acetate buffer (10 mM, 1% diethanolamine, pH 4.0)-acetonitrile gradient. Safracin was detected by absorption at 268 nm. FIG. 6 shows the HPLC profiles of safracin and safracin precursors produced by the P. fluorescens A2-2 strain and of the different safracin-like structures produced by P. fluorescens mutants.

Example 3 Cloning and Sequence Analysis of Safracin Cluster

Inverse PCR and Phage Library Hybridisation

Southern hybridisation on mutant chromosomal DNAs verified the correct gene disruption and demonstrated that the peptide synthetase fragment cloned into the pK18mob plasmid was essential for the production of safracin. Analysis of the safracin non-producing mutants obtained demonstrated that all of them carried a disruption in the same gene, sacA.

Inverse PCR from genomic DNA and screening of a phage library of P. fluorescens A2-2 genomic DNA revealed the presence of additional genes flanking the sacA gene, probably involved in safracin biosynthesis.

The GenBank accession number for the nucleotide sequence data of the P. fluorescens A2-2 safracin biosynthetic cluster is AY061859.

Cosmid Library Construction and Heterologous Expression

To determine whether the safracin cluster was able to confer safracin biosynthetic capability on a non-producer strain, it was cloned into a wide host range cosmid vector (pLAFR3, Staskawicz B. et al. J Bacteriol 1987, 169, 5789-5794) and conjugated into different Pseudomonas sp. collection strains.

To obtain a clone containing the whole cluster, a cosmid library was constructed and screened. For this purpose, chromosomal DNA was partially digested with the restriction enzyme PstI, and the fragments were dephosphorylated and ligated into the PstI site of cosmid vector pLAFR3. The cosmids were packaged with Gigapack III gold packaging extracts (Stratagene) according to the manufacturer's recommendations. Infected cells of strain XL1-Blue were plated on LB-agar supplemented with 50 μg/ml of tetracycline. Positive clones were selected by colony hybridization with a DIG-labeled DNA fragment belonging to the 3′-end of the safracin cluster. To ensure that the whole cluster had been cloned, a second colony hybridization with a 5′-end DNA fragment was carried out. Only cosmid pL30p showed hybridization with multiple DNA probes. To confirm accurate cloning, PCR amplification and DNA sequencing with oligonucleotides belonging to the safracin sequence were carried out. The size of the insert of pL30P was 26,705 bp. The pL30p clone DNA was transformed into E. coli S17λPIR and the resulting strain was conjugated with the heterologous Pseudomonas sp. strains. The pL30p cosmid was introduced into P. fluorescens CECT378 and P. aeruginosa CECT110 by biparental conjugation as described above. Once a clone encoding the whole cluster had been identified, it was determined whether the candidate was capable of producing safracin. Safracin production in the conjugated strains was assessed by HPLC analysis and biological assay of broth culture supernatants as previously described.

The strain P. fluorescens CECT378 carrying the pL30p cosmid (PM-19-002) was able to produce safracin in considerable amounts, whereas safracin production in the P. aeruginosa CECT110 strain carrying pL30P (PM-16-002) was about 10 times lower than in CECT378. Safracin production in these strains was about 22% and 2%, respectively, of the production of the natural producer strain.

Genes Involved in the Formation of Safracin. Sequence Analysis of sacABCDEFGH and sacIJ Operons

Computer analyses of the DNA sequence of pL30P revealed 14 ORFs (FIG. 1). A potential ribosome binding site precedes each of the ATG start codons.

In the sacABCDEFGH operon, three very large ORFs, sacA, sacB and sacC (positions 3052 to 6063, 6080 to 9268 and 9275 to 13570 of the P. fluorescens A2-2 safracin sequence SEQ ID NO:1, respectively) can be read in the same direction and encode the putative safracin NRPSs: SacA (1004 amino acids, M_(r) 110452), SacB (1063 amino acids, M_(r) 117539) and SacC (1432 amino acids, M_(r) 157331). The three NRPS genes contain domains resembling amino acid activating domains of known peptide synthetases. Specifically, the amino acid activating domains from these NRPS genes are very similar to three of the four amino acid activating domains (Gly, Tyr and Tyr) found in the Myxococcus xanthus saframycin NRPSs (Pospiech et al. Microbiology 1995, 141, 1793-803; Pospiech et al. Microbiol. 1996, 142, 741-746). In particular, SacA (SEQ ID NO:2) shows 33% identity with saframycin Mx1 synthetase B protein (SafB) from M. xanthus (NCBI accession number U24657), whereas SacB (SEQ ID NO:3) and SacC (SEQ ID NO:4) share, respectively, 39% and 41% identity with saframycin Mx1 synthetase A (SafA) from M. xanthus (NCBI accession number U24657). FIG. 2 shows a comparison among SacA, SacB and SacC and the different amino acid activating domains of the saframycin NRPSs.

Downstream of sacC there are five small ORFs reading in the same direction as the NRPS genes (FIG. 1). The first one, sacD (position 13602 to 14651 of P. fluorescens A2-2 safracin sequence), encodes a putative protein, SacD (350 amino acids, M_(r) 39187; SEQ ID NO:5), with no similarities in the GenBank database. The next one, sacE (position 14719 to 14901 of P. fluorescens A2-2 safracin sequence), encodes a small putative protein called SacE (61 amino acids, M_(r) 6729; SEQ ID NO:6), which shows some similarity to proteins of unknown function in the databases (ORF1 from Streptomyces viridochromogenes (NCBI accession number Y17268; 44% identity) and MbtH from Mycobacterium tuberculosis (NCBI accession number Z95208; 36% identity)). The third ORF, sacF (position 14962 to 16026 of P. fluorescens A2-2 safracin sequence), encodes a 355-residue protein with a calculated molecular weight of 39,834 (SEQ ID NO:7). This protein most closely resembles hydroxyneurosporene methyltransferase (CrtF) from Chloroflexus aurantiacus (NCBI accession number AF288602; 25% identity). The nucleotide sequence of the fourth ORF, sacG (position 16115 to 17155 of P. fluorescens A2-2 safracin sequence), predicted a gene product of 347 amino acids with a molecular mass of 38.22 kDa (SEQ ID NO:8). The protein, called SacG, is similar to bacterial O-methyltransferases, including O-dimethylpuromycin-O-methyltransferase (DmpM) from Streptomyces anulatus (NCBI accession number P42712; 31% identity). A computer search also shows that this protein contains the three sequence motifs found in diverse S-adenosylmethionine-dependent methyltransferases (Kagan and Clarke, Arch Biochem. Biophys. 1994, 310, 417-427). The fifth gene, sacH (position 17244 to 17783 of P. fluorescens A2-2 safracin sequence), encodes a putative protein SacH (180 amino acids, M_(r) 19632; SEQ ID NO:9). A computer search for similarities between the deduced amino acid sequence of SacH and other protein sequences revealed identity with some conserved hypothetical proteins of unknown function, which contain a well conserved transmembrane motif and a dihydrofolate reductase-like active site (conserved hypothetical protein from Pseudomonas aeruginosa PAO1, NCBI accession number P3469; 35% identity).

Upstream of the sacABCDEFGH operon, reading in the opposite direction, a two-gene operon, sacIJ, is located. The sacI gene (position 2513 to 1854) encodes a 220-amino acid protein (M_(r) 24219; SEQ ID NO:10) that most closely resembles ubiquinone/menaquinone methyltransferase from Thermotoga maritima (NCBI accession number AE001745; 32% identity). The sacJ gene (position 1861 to 335) encodes a 509-amino acid protein (SEQ ID NO:11), with a molecular mass of 55341 Da, similar to bacterial monooxygenases/hydroxylases, including putative monooxygenases from Bacillus subtilis (NCBI accession number Y14081; 33% identity) and Streptomyces coelicolor (NCBI accession number AL109972; 29% identity).

The sacABCDEFGH and sacIJ operons are transcribed divergently and are separated by approximately 450 bp. Both operons are flanked by residual transposase fragments.
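The ORF coordinates and protein sizes quoted in this section can be cross-checked with a short script. The sketch below is only an illustration, not part of the original analysis: it assumes that the quoted start and stop positions are inclusive and span the coding codons only, so that the ORF length divided by three should equal the stated number of amino acids. The dictionary and function names are hypothetical.

```python
# (start, stop, amino acids) as quoted in the text for SEQ ID NO:1.
# start > stop means the ORF is read on the complementary strand.
ORFS = {
    "sacA": (3052, 6063, 1004),
    "sacB": (6080, 9268, 1063),
    "sacC": (9275, 13570, 1432),
    "sacD": (13602, 14651, 350),
    "sacE": (14719, 14901, 61),
    "sacF": (14962, 16026, 355),
    "sacG": (16115, 17155, 347),
    "sacH": (17244, 17783, 180),
    "sacI": (2513, 1854, 220),
    "sacJ": (1861, 335, 509),
}

def orf_length(start: int, stop: int) -> int:
    """Length in bp of an ORF given inclusive coordinates on either strand."""
    return abs(stop - start) + 1

def strand(start: int, stop: int) -> str:
    """'+' for ORFs read in the direction of SEQ ID NO:1, '-' otherwise."""
    return "+" if start < stop else "-"

def is_consistent(start: int, stop: int, amino_acids: int) -> bool:
    """True if the ORF length is a whole number of codons matching the protein."""
    codons, remainder = divmod(orf_length(start, stop), 3)
    return remainder == 0 and codons == amino_acids

for name, (start, stop, aa) in ORFS.items():
    status = "ok" if is_consistent(start, stop, aa) else "check"
    print(f"{name} {strand(start, stop)} {orf_length(start, stop)} bp {aa} aa {status}")
```

With the coordinates given in the text, every ORF length is an exact multiple of three that matches the stated protein size, and only sacI and sacJ lie on the complementary strand, in agreement with the divergent arrangement of the two operons.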

Related Safracin Cluster Genes

A putative ORF (orf1; position 18322 to 19365 of P. fluorescens A2-2 safracin sequence) located at the 3′-end of the safracin sequence has been found (FIG. 1). The ORF1 protein (SEQ ID NO:12) shows similarity to aminopeptidases in the GenBank database (peptidase M20/M25/M40 family from Caulobacter crescentus CB15; NCBI accession number NP422131; 30% identity). Using the strategy described in Example 2, disruption of orf1 did not affect safracin production in P. fluorescens A2-2.

At the 3′-end of the safracin sequence cloned in the pL30p cosmid, three putative ORFs (orf2, orf3 and orf4) were found. Reading in the opposite direction to the sacABCDEFGH operon, the orf2 gene (position 22885 to 21169 of SEQ ID NO:1) codes for a protein, ORF2 (SEQ ID NO:13), with similarities to the Aquifex aeolicus HoxX sensor protein (NCBI accession number NC000918.1; 35% identity), whereas the orf3 gene (position 23730 to 23041 of SEQ ID NO:1) codes for the ORF3 protein (SEQ ID NO:14), which shares 44% identity with a glycosyl transferase-related protein from Xanthomonas axonopodis pv. citri str. 306 (NCBI accession number NP642442).

The third gene is located at the 3′-end of SEQ ID NO:1 (position 25037 to 26095). This gene, named orf4, encodes a protein, ORF4 (SEQ ID NO:15), that most closely resembles a hypothetical isochorismatase family protein, YcdL, from Escherichia coli (NCBI accession number P75897; 32% identity).

Presumably, these three genes are not involved in the safracin biosynthetic pathway; however, future disruption of these genes will be needed to confirm this assumption.

The different DNA sequences found are listed at the end of the description.

Example 4 Functional Analysis of the Safracin Loci and Search for Possible Precursors

Since the pathway for synthesis of safracin in P. fluorescens A2-2 is at present unknown, the inactivation of each of the genes described in Example 3 would permit fundamental studies on the mechanism of safracin biosynthesis in this strain.

In order to analyze the function of each protein in the safracin production pathway, each gene of the cluster except sacE was disrupted. All of the mutants were obtained following the disruption strategy previously described.

FIG. 6 summarizes the different mutants constructed in this invention and the compounds produced by the mutants as a consequence of the gene disruption. In the wild type strain, both safracin A and B and other compounds, P2 and P14, were clearly detected by HPLC (see FIG. 6, WT). Disruption of the sacA (PM-S1-002), sacB (PM-S1-003), sacC (PM-S1-004), sacD (PM-S1-010), sacF (PM-S1-008) and sacG (PM-S1-009) genes generated mutants that were unable to produce either safracin A and safracin B or the precursor compounds with retention times below 15 min, P2 and P14. The structure elucidation of P14 and P2 revealed that P14 is a 3-methyl-O-methyl tyrosine, whereas P2 is a 3-hydroxy-5-methyl-O-methyl tyrosine. Because of the small size of the sacE gene, the sacE⁻ mutant could not be obtained by gene disruption, but deletion of this gene is in progress. The overexpression of SacE protein, in trans, had no effect on safracin B/A production. The sacI⁻ mutants (PM-S1-006) produced P2, P14 and a significant amount of a compound called P19B (FIG. 6; FIG. 7 b*). Structure elucidation of P19B revealed that this compound is a safracin-like molecule in which the N-methyl group and one of the OH groups of the quinone ring are missing. In the sacJ⁻ mutants (PM-S1-005), P2, P14, P19B and two new compounds called P22A and P22B were obtained (FIG. 6; FIG. 7 a*). Structure elucidation of P22A and P22B revealed that they are safracin A-like and safracin B-like molecules, respectively, lacking one of the -OH groups of the quinone ring. The biological assay of extracts of the sacI⁻ and sacJ⁻ mutants revealed very low activity against Bacillus subtilis.

The disruption of the sacI gene with reconstitution of sacJ gene expression resulted in a mutant producing new safracins, PM-S1-007. The two antibiotics, produced at levels as high as those of safracin A and safracin B in the wild type strain, have been named safracin D and safracin E (FIG. 7 c*). Safracin D and safracin E are safracin B-like and safracin A-like molecules, respectively, in which the N-methylation is missing.

These results strongly suggest that i) the sacA, sacB and sacC genes encode the safracin NRPSs; ii) the sacD, sacF and sacG genes are responsible for the transformation of L-Tyr into the L-Tyr derivative P2; and iii) sacI and sacJ are responsible for the tailoring modifications that convert P19 and P22 into safracin.

Characterization of Natural Precursors:

Strain: Pseudomonas fluorescens A2-2 (wild type) (PM-S1-001)

Fermentation Conditions:

Seed medium YMP3, containing 1% glucose; 0.25% beef extract; 0.5% bacto-peptone; 0.25% NaCl; 0.8% CaCO₃, was inoculated with 0.1% of a frozen vegetative stock of the microorganism and incubated on a rotary shaker (250 rpm) at 27° C. After 30 h of incubation, the seed culture was transferred at 2% (v/v) into 2000 ml Erlenmeyer flasks containing 250 ml of the M-16B production medium, composed of 15.2% mannitol; 3.5% dried brewer's yeast; 1.4% (NH₄)₂SO₄; 0.001% FeCl₃; 2.6% CaCO₃. The incubation temperature was 27° C. from inoculation until 40 hours and then 24° C. until the end of the process (71 hours). The pH was not controlled. The rotary shaker was operated at 220 rpm with 5 cm eccentricity.
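For preparing the production medium at flask scale, the percentages above can be converted to weighed amounts. The following sketch is an illustration only: it assumes that the percentages are weight/volume (g per 100 ml), which the text does not state explicitly, and the names `M16B_PERCENT_WV` and `grams_needed` are hypothetical.

```python
# M-16B production medium composition, assumed to be % weight/volume.
M16B_PERCENT_WV = {
    "mannitol": 15.2,
    "dried brewer's yeast": 3.5,
    "(NH4)2SO4": 1.4,
    "FeCl3": 0.001,
    "CaCO3": 2.6,
}

def grams_needed(percent_wv: float, volume_ml: float) -> float:
    """Grams of a component for a given volume, at percent_wv g per 100 ml."""
    return percent_wv / 100.0 * volume_ml

FLASK_VOLUME_ML = 250  # medium volume per 2000 ml Erlenmeyer flask

for component, percent in M16B_PERCENT_WV.items():
    grams = grams_needed(percent, FLASK_VOLUME_ML)
    print(f"{component}: {grams:.4g} g per {FLASK_VOLUME_ML} ml")
```

Under the weight/volume assumption, each 250 ml flask would need 38 g of mannitol, 8.75 g of dried brewer's yeast, 3.5 g of ammonium sulphate, 2.5 mg of FeCl₃ and 6.5 g of CaCO₃.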

Isolation:

After 71 hours of incubation, 2 Erlenmeyer flasks were pooled and the 500 ml of fermentation broth was clarified by centrifugation at 7,500 rpm for 15 minutes. 50 grams of XAD-16 resin (Amberlite) were added to the supernatant and mixed for 30 minutes at room temperature. The resin was then recovered from the clarified broth by filtration, washed twice with distilled water and extracted with 250 ml of isopropanol (2-PrOH). The alcohol extract was dried under high vacuum to give 500 mg of crude extract. This crude extract was dissolved in methanol and purified by column chromatography on Sephadex LH-20 with methanol as mobile phase. The P-14 compound was eluted and dried to give 15 mg of a yellowish solid. Purity was checked by analytical HPLC and ¹H NMR.

P-14 was also isolated in a similar way from cultures of the sacJ⁻ mutant (PM-S1-005), using semipreparative HPLC as the last step in the purification process.

Biological Activities:

Not active

Spectroscopic Data:

ESMS m/z 254 (C₁₁H₁₄NO₃Na₂⁺), 232 (C₁₁H₁₅NO₃Na⁺), 210 (M+H⁺). ¹H NMR (300 MHz, CD₃OD): 7.07 (d, J=8.1 Hz, H-9), 7.06 (s, H-5), 6.84 (d, J=8.1 Hz, H-8), 3.79 (s, H-11), 3.72 (dd, J=8.7, 3.9 Hz, H-2), 3.20 (dd, J=14.4, 3.9 Hz, H-3a), 2.91 (dd, J=14.4, 8.9 Hz, H-3b), 2.16 (s, H-10). ¹³C NMR (75 MHz, CD₃OD): 174.1 (C-1), 158.6 (C-7), 132.5 (C-5), 128.9 (C-9), 128.5 (C-4), 128.0 (C-6), 111.4 (C-8), 57.6 (C-2), 55.8 (C-11), 37.4 (C-3), 16.3 (C-10).
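The three ESMS ions listed above are consistent with the protonated, sodiated and disodiated forms of a compound of formula C₁₁H₁₅NO₃ (3-methyl-O-methyl tyrosine). The following sketch, offered as an illustrative check rather than as part of the original analysis, computes the monoisotopic m/z of these singly charged ions. The atomic masses are standard monoisotopic values rounded to seven decimals, and the function names are hypothetical.

```python
# Monoisotopic atomic masses (u) and the electron mass (u).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "Na": 22.9897693}
ELECTRON = 0.00054858

def mono_mass(formula: dict) -> float:
    """Monoisotopic mass of a neutral molecule given element counts."""
    return sum(MONO[element] * n for element, n in formula.items())

def adduct_mz(neutral_mass: float) -> dict:
    """m/z of common singly charged positive ions of a neutral molecule."""
    return {
        "[M+H]+": neutral_mass + MONO["H"] - ELECTRON,
        "[M+Na]+": neutral_mass + MONO["Na"] - ELECTRON,
        "[M-H+2Na]+": neutral_mass - MONO["H"] + 2 * MONO["Na"] - ELECTRON,
    }

P14 = {"C": 11, "H": 15, "N": 1, "O": 3}  # 3-methyl-O-methyl tyrosine

for ion, mz in adduct_mz(mono_mass(P14)).items():
    print(f"{ion}: {mz:.4f}")
```

Rounded to nominal mass, the three ions fall at m/z 210, 232 and 254, matching the reported spectrum.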

Strain: Pseudomonas fluorescens A2-2 (wild type) (PM-S1-001)

Fermentation Conditions:

The same process as for P-14

Isolation:

Similar procedure to that for P-14, except that in the Sephadex chromatography the fractions containing P-2 eluted later. A semi-preparative HPLC step (Symmetry Prep C-18 column, 7.8×150 mm, AcONH₄ 10 mM pH=3/CH₃CN 95:5 held for 5 min and then gradient from 5 to 6.8% of CH₃CN in 3 min) was necessary to purify P-2. This compound has also been isolated from the fermentation broth of Pseudomonas putida ATCC12633+pB5H83 (PM-17-004) as a result of heterologous expression.

Biological Activities:

Not active

Spectroscopic Data:

ESMS m/z 226 [M+H]⁺; ¹H NMR (CD₃OD, 300 MHz): 6.65 (d, J=1.8 Hz, H-5), 6.59 (d, J=1.8 Hz, H-9), 3.72 (s, H-11), 3.71 (dd, J=9.0, 4.2 Hz, H-2), 3.16 (dd, J=14.4, 4.2 Hz, H-3a), 2.83 (dd, J=14.4, 9.0 Hz, H-3b), 2.22 (s, H-10); ¹³C NMR (DMSO, 75 MHz): 170.88 (s, C-1), 150.025 (s, C-7), 144.56 (s, C-8), 132.28 (s, C-4), 130.36 (s, C-6), 121.73 (d, C-5), 115.55 (d, C-9), 59.06 (q, 7-OMe), 55.40 (d, C-2), 36.21 (t, C-3), 15.86 (q, 6-Me).

Characterization of Safracins like Compounds Obtained by Knock Out

Strain: sacJ⁻ mutant from P. fluorescens A2-2 (PM-S1-005)

Fermentation Conditions:

50 liters of SAM-7 medium, composed of dextrose (3.2%), mannitol (9.6%), dry brewer's yeast (2%), ammonium sulphate (1.4%), potassium secondary phosphate (0.03%), potassium chloride (0.8%), iron (III) chloride hexahydrate (0.001%), L-tyrosine (0.1%), calcium carbonate (0.8%), poly-(propylene glycol) 2000 (0.05%) and antifoam ASSAF 1000 (0.2%), were poured into a jar fermentor (Bioengineering LP-351) with 75 l total capacity and, after sterilization, sterile antibiotics (ampicillin 0.05 g/l and kanamycin 0.05 g/l) were added. The medium was then inoculated with seed culture (2%) of the mutant strain PM-S1-005. The fermentation was carried out for 71 h under aerated and agitated conditions (1.0 l/l/min and 500 rpm). The temperature was controlled at 27° C. (from inoculation until 24 hours) and then at 25° C. (from 24 h until the end of the process). The pH was controlled at 6.0 by automatic feeding of diluted sulphuric acid from 22 hours until the end of the process.

Isolation

The whole broth was clarified (Sharples centrifuge). The pH of the clarified broth was adjusted to 9.0 by addition of 10% NaOH and the broth was extracted with 25 liters of ethyl acetate. After 20 min of mixing, the two phases were separated. The organic phase was frozen overnight, then filtered to remove ice and evaporated to a greasy dark green extract (65.8 g). This extract was mixed with 500 ml of hexane (250 ml two times) and filtered to remove hexane-soluble impurities. The remaining solid, after drying, gave 27.4 g of a dry green-beige extract.

This new extract was dissolved in methanol and purified by Sephadex LH-20 chromatography (using methanol as mobile phase), and the safracin-like compounds were eluted in the central fractions (analysed by TLC under the following conditions: normal-phase silica, mobile phase EtOAc:MeOH 5:3; approx. Rf values: 0.3 for P-22B, 0.25 for P-22A and 0.1 for P-19).

The pooled fractions (7.6 g) containing the three safracin-like compounds were purified on a silica column using a mixture of EtOAc:MeOH from 50:1 to 0:1 and another chromatographic system (isocratic CHCl₃:MeOH:H₂O:AcOH 50:45:5:0.1). Compounds P22-A, P22-B and P19-B were purified by reversed-phase HPLC (SymmetryPrep C-18 column 150×7.8 mm, 4 mL/min, mobile phase: 5 min MeOH:H₂O (0.02% TFA) 5:95 and gradient from MeOH:H₂O (0.02% TFA) 5:95 to MeOH 100% in 30 min).

Biological Activities of Safracin P-22B

Cell lines (mol/L), primary screening, safracin P-22B

Tissue     Cell line   GI50       TGI        LC50
Prostate   DU-145      4.58E−06   8.62E−06   1.62E−05
Prostate   LN-caP      3.08E−07   6.08E−07   1.20E−06
Ovary      IGROV       8.49E−07   2.30E−06   1.21E−05
Ovary      IGROV-ET    3.02E−06   7.04E−06   1.65E−05
Breast     SK-BR3      8.24E−07   2.28E−06   8.85E−06
Melanoma   SK-MEL-28   5.20E−07   9.99E−07   2.01E−06
NSCL       A549        4.71E−06   8.83E−06   1.66E−05
Leukemia   K-562       1.13E−07   4.67E−07   1.84E−06
Pancreas   PANC1       4.77E−06   1.17E−05   >1.90E−05
Colon      HT29        1.01E−06   2.75E−06   1.86E−05
Colon      LOVO        2.54E−06   6.84E−06   1.84E−05
Colon      LOVO-DOX    6.95E−06   1.90E−05   >1.90E−05
Cervix     HELA        7.61E−07   1.83E−06   7.42E−06
Cervix     HELA-APL    4.65E−07   9.32E−07   1.86E−06

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): 10 mm inhibition zone

Spectroscopic Data:

HRFABMS m/z 509.275351 [M-H₂O+H]⁺ (calcd for C₂₈H₃₇N₄O₅ 509.276396 Δ1.0 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 509 [M-H₂O+H]⁺ (5), 460 (2.7), 391 (3).

¹H NMR (CD₃OD, 500 MHz): 6.70 (s, H-15), 6.52 (s, H-5), 4.72 (bs, H-11), 4.66 (d, J=2.0 Hz, H-21), 4.62 (dd, J=8.4, 3.7 Hz, H-1), 3.98 (bd, J=7.6 Hz, H-13), 3.74 (s, 7-OMe), 3.71 (s, 17-OMe), 3.63 (m, overlapped signal, H-25), 3.62 (m, overlapped signal, H-3), 3.30 (m, H-22a), 3.29 (m, H-14a), 3.18 (d, J=18.6 Hz, H-14b), 2.90 (m, H-4a), 2.88 (m, H-22b), 2.76 (s, 12-NMe), 2.30 (s, 16-Me), 2.22 (m, H-4b), 1.16 (d, J=7.4 Hz, H-26);

¹³C NMR (CD₃OD, 125 MHz): 170.75 (s, C-24), 149.24 (s, C-18), 147.54 (s, C-8), 145.95 (s, C-7), 145.82 (s, C17), 133.93 (s, C-16), 132.31 (s, C-9), 131.30 (s, C-6), 128.95 (s, C-20), 121.93 (d, C-15), 121.76 (d, C-5), 121.44 (s, C-10), 112.45 (s, C-19), 92.87 (d, C-21), 60.86 (q, 7-OMe), 60.76 (q, 17-OMe), 59.39 (d, C-11), 57.96 (d, C-13), 55.51 (d, C-1), 54.29 (d, C-3), 50.08 (d, C-25), 45.55 (t, C-22), 40.43 (q, 12-NMe), 32.56 (t, C-4), 25.84 (t, C-14), 17.20 (q, C-26), 16.00 (q, 16-Me), 15.81 (q, 6-Me).
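The GI50 values reported above for safracin P-22B can be ranked to see at a glance which cell lines were most sensitive in the primary screening. The sketch below is only an illustration of that comparison: the dictionary simply transcribes the GI50 values (mol/L) given for P-22B, and the variable names and the conversion to pGI50 (the negative base-10 logarithm) are conveniences introduced here, not part of the original analysis.

```python
import math

# GI50 values (mol/L) for safracin P-22B, transcribed from the text.
GI50_P22B = {
    "DU-145": 4.58e-06, "LN-caP": 3.08e-07, "IGROV": 8.49e-07,
    "IGROV-ET": 3.02e-06, "SK-BR3": 8.24e-07, "SK-MEL-28": 5.20e-07,
    "A549": 4.71e-06, "K-562": 1.13e-07, "PANC1": 4.77e-06,
    "HT29": 1.01e-06, "LOVO": 2.54e-06, "LOVO-DOX": 6.95e-06,
    "HELA": 7.61e-07, "HELA-APL": 4.65e-07,
}

# Sort from most sensitive (lowest GI50) to least sensitive.
ranked = sorted(GI50_P22B.items(), key=lambda item: item[1])

for cell_line, gi50 in ranked:
    # pGI50 puts the values on a log scale; higher means more sensitive.
    print(f"{cell_line:10s} GI50 = {gi50:.2e} mol/L  pGI50 = {-math.log10(gi50):.2f}")
```

On these figures the leukemia line K-562 shows the lowest GI50 and LOVO-DOX the highest, a spread of somewhat less than two orders of magnitude.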

Strain: The same as for P-22B Fermentation Conditions: The same as for P-22B Isolation: The same as for P-22B Biological Activities of Safracin P-22A Antitumor Activities

Cells Lines (Mol/L) Prostate Ovary Breast Melanoma NSCL Primary Screening DU-145 LN-caP IGROV IGROV-ET SK-BR3 SK-MEL-28 A549 Safracin P-22A GI50 >1.96E−05 4.19E−06 7.74E−06 1.30E−05 1.27E−05 5.93E−06 >1.96E−05 TGI >1.96E−05 9.26E−06 1.96E−05 >1.96E−05   >1.96E−05 1.33E−05 >1.96E−05 LC50 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 Leukemia Pancreas Colon Cervix Primary Screening K-562 PANC1 HT29 LOVO LOVO-DOX HELA HELA-APL Safracin P-22A GI50 3.15E−06 >1.96E−05   1.26E−05 >1.96E−05 >1.96E−05 8.75E−06 7.66E−06 TGI 7.93E−06 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05   1.96E−05 LC50 1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 >1.96E−05 Antimicrobial activity: On solid medium Bacillus subtilis. 10 μg/disk (6 mm diameter): NO ACTIVE Spectroscopic data:

HRFABMS m/z 511.290345 [M+H]⁺ (calcd for C₂₈H₃₉N₄O₅ 511.292046 Δ1.7 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 511 [M+H]⁺ (61), 409 (25), 391 (4); ¹H NMR (CD₃OD, 500 MHz): 6.68 (s, H-15), 6.44 (s, H-5), 3.71 (s, 7-OMe), 3.67 (s, 17-OMe), 2.72 (s, 12-NMe), 2.28 (s, 16-Me), 2.20 (s, 6-Me), 0.87 (d, J=7.1 Hz, H-26);

Strain: The same as for P-22B

Fermentation Conditions: The same as for P-22B

Isolation: The same as for P-22B

Biological Activities of Safracin P-19B

Antitumor Activities

Primary screening of Safracin P-19B (mol/L):

| Tissue | Cell line | GI50 | TGI | LC50 |
|---|---|---|---|---|
| Prostate | DU-145 | 1.70E−05 | >1.95E−05 | >1.95E−05 |
| Prostate | LN-caP | 3.90E−06 | 8.06E−06 | 1.67E−05 |
| Ovary | IGROV | 5.42E−06 | 1.48E−05 | >1.95E−05 |
| Ovary | IGROV-ET | 8.74E−06 | >1.95E−05 | >1.95E−05 |
| Breast | SK-BR3 | 7.08E−06 | 1.92E−05 | >1.95E−05 |
| Melanoma | SK-MEL-28 | 7.90E−06 | >1.95E−05 | >1.95E−05 |
| NSCL | A549 | >1.95E−05 | >1.95E−05 | >1.95E−05 |
| Leukemia | K-562 | 2.38E−06 | 5.77E−06 | 1.40E−05 |
| Pancreas | PANC1 | 1.81E−05 | >1.95E−05 | >1.95E−05 |
| Colon | HT29 | 1.55E−05 | >1.95E−05 | >1.95E−05 |
| Colon | LOVO | >1.95E−05 | >1.95E−05 | >1.95E−05 |
| Colon | LOVO-DOX | 1.44E−05 | >1.95E−05 | >1.95E−05 |
| Cervix | HELA | 6.73E−06 | 1.61E−05 | >1.95E−05 |
| Cervix | HELA-APL | 4.80E−06 | 1.00E−05 | 1.95E−05 |

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): not active

Spectroscopic Data:

HRFABMS m/z 495.260410 [M-H₂O+H]⁺ (calcd for C₂₇H₃₅N₄O₅ 495.260746 Δ0.3 mmu); LRFABMS using m-NBA as matrix m/z (rel intensity) 495 [M-H₂O+H]⁺ (13), 460 (3), 391 (2); ¹H NMR (CD₃OD, 500 MHz): 6.67 (s, H-15), 6.5 (s, H-5), 3.73 (s, 7-OMe), 3.71 (s, 17-OMe), 2.29 (s, 16-Me), 2.24 (s, 6-Me), 1.13 (d, J=7.1 Hz, H-26);
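The mass errors quoted with the three HRFABMS measurements above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, the atomic masses are standard monoisotopic values rather than figures from this document, and it assumes the reported "calcd" values are plain sums of atomic masses for the stated ion composition (which is what the numbers appear to match, although the source does not say how they were computed).

```python
# Standard monoisotopic atomic masses (u); reference values, not taken from the source.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}

# (ion composition, observed m/z, calcd m/z) as reported in the spectroscopic data above.
MEASUREMENTS = [
    ({"C": 28, "H": 37, "N": 4, "O": 5}, 509.275351, 509.276396),
    ({"C": 28, "H": 39, "N": 4, "O": 5}, 511.290345, 511.292046),
    ({"C": 27, "H": 35, "N": 4, "O": 5}, 495.260410, 495.260746),
]


def monoisotopic_mass(counts):
    """Sum of monoisotopic atomic masses for an elemental composition."""
    return sum(MONOISOTOPIC[element] * n for element, n in counts.items())


def error_mmu(observed, calculated):
    """Absolute mass error in millimass units (1 mmu = 0.001 u)."""
    return abs(observed - calculated) * 1000.0


def summarize():
    """Return (recomputed mass, error in mmu rounded to 0.1) for each measurement."""
    return [
        (round(monoisotopic_mass(counts), 6), round(error_mmu(obs, calcd), 1))
        for counts, obs, calcd in MEASUREMENTS
    ]


if __name__ == "__main__":
    for (mass, err), (_, obs, calcd) in zip(summarize(), MEASUREMENTS):
        print(f"observed {obs:.6f}  calcd {calcd:.6f}  recomputed {mass:.6f}  error {err} mmu")
```

Under these assumptions the recomputed masses agree with the reported calculated values to within about 0.001 mmu, and the rounded errors reproduce the Δ values given in the text.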

New Safracin Compounds Obtained by Knock Out

Strain: sacI⁻ with sacJ expression reconstitution from P. fluorescens A2-2 (PM-S1-007)

Fermentation Conditions: SAM-7 medium (50 l) composed of dextrose (3.2%), mannitol (9.6%), dry brewer's yeast (2%), ammonium sulphate (1.4%), potassium secondary phosphate (0.03%), potassium chloride (0.8%), iron (III) chloride hexahydrate (0.001%), L-tyrosine (0.1%), calcium carbonate (0.8%), poly(propylene glycol) 2000 (0.05%) and antifoam ASSAF 1000 (0.2%) was poured into a jar fermentor (Bioengineering LP-351) of 75 l total capacity and, after sterilization, sterile antibiotics (ampicillin 0.05 g/l and kanamycin 0.05 g/l) were added. The medium was then inoculated with seed culture (2%) of the mutant strain PM-S1-007. The fermentation was carried out for 89 h under aerated and agitated conditions (1.0 l/l/min and 500 rpm). The temperature was held at 27° C. from inoculation until 24 hours and at 25° C. from 24 h to the end of the process. The pH was controlled at 6.0 by automatic feeding of dilute sulphuric acid from 27 hours to the end of the process.

Isolation:

After removal of cells by centrifugation, the culture medium thus obtained (45 l) was adjusted to pH 9.5 with dilute sodium hydroxide and extracted twice with 25 liters of ethyl acetate. The extraction was carried out in an agitated vessel at room temperature for 20 minutes. The two phases were separated by a liquid-liquid centrifuge. The organic phases were frozen at −20° C., filtered to remove ice and evaporated to give 35 g of a dark, oily crude extract. After trituration with 5 l of hexane, the extract (12.6 g) was purified on a flash chromatography column (5.5 cm diameter, 20 cm length) of normal-phase silica, using ethyl acetate:MeOH as mobile phase (1 L each of 1:0, 20:1, 10:1, 5:1 and 7:3). Fractions of 250 ml were eluted and pooled according to TLC (normal-phase silica, EtOAc:MeOH 5:2; safracin D Rf 0.2, safracin E Rf 0.05). The fraction containing impure safracin D and E was evaporated under high vacuum (2.2 g). An additional purification step under similar conditions (EtOAc:MeOH from 1:0 to 5:1) was necessary to separate D and E; the fractions containing safracin D and those containing safracin E were evaporated separately and further purified by Sephadex LH-20 column chromatography, eluting with methanol.

The safracins D and E obtained were independently precipitated from CH₂Cl₂ (80 ml) and hexane (1500 ml) as green/yellowish dried solids (800 mg safracin D and 250 mg safracin E).

Biological Activities of Safracin D

Antitumor Screening:

Primary screening of Safracin D (mol/L):

| Tissue | Cell line | GI50 | TGI | LC50 |
|---|---|---|---|---|
| Prostate | DU-145 | 5.22E−06 | 9.99E−06 | 1.90E−05 |
| Prostate | LN-caP | 1.54E−06 | 4.12E−06 | 9.78E−06 |
| Ovary | IGROV | 2.68E−06 | 6.02E−06 | 1.35E−05 |
| Ovary | IGROV-ET | 1.33E−06 | 3.34E−06 | 9.15E−06 |
| Breast | SK-BR3 | 4.71E−06 | 7.82E−06 | 1.30E−05 |
| Melanoma | SK-MEL-28 | 3.51E−06 | 6.21E−06 | 1.10E−05 |
| NSCL | A549 | 6.04E−06 | 1.07E−05 | 1.88E−05 |
| Leukemia | K-562 | 6.04E−07 | 1.16E−06 | 3.78E−06 |
| Pancreas | PANC1 | 4.77E−06 | 1.10E−05 | >1.90E−05 |
| Colon | HT29 | 4.33E−06 | 1.79E−05 | >1.90E−05 |
| Colon | LOVO | 6.99E−06 | 1.82E−05 | >1.90E−05 |
| Colon | LOVO-DOX | 4.75E−06 | 8.85E−06 | 1.65E−05 |
| Cervix | HELA | 3.76E−06 | 6.68E−06 | 1.19E−05 |
| Cervix | HELA-APL | 2.28E−06 | 5.24E−06 | 1.21E−05 |

Secondary evaluation of Safracin D (mol/L):

| Secondary screening | Readout | IC50 |
|---|---|---|
| Macromolecules synthesis | PROTEIN | 1.90E−05 |
| Macromolecules synthesis | DNA | 1.52E−05 |
| Macromolecules synthesis | RNA | 3.80E−06 |
| Apoptosis | NUCLEOSOMES | 2.85E−06 |
| DNA binding | GEL | 6.65E−06 |

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): inhibition zone of 15 mm diameter

Spectroscopic Data:

ESMS: m/z 509 [M-H₂O+H]⁺; ¹H NMR (CDCl₃, 300 MHz): 6.50 (s, C-15), 4.02 (s, OMe), 3.73 (s, OMe), 2.22 (s, Me), 1.85 (s, Me), 0.80 (d, J=7.2 Hz); ¹³C NMR (CDCl₃, 75 MHz): 186.51, 181.15, 175.83, 156.59, 145.09, 142.59, 140.78, 137.84, 131.20, 129.01, 126.88, 121.57 (2×C), 82.59, 60.92, 60.69, 53.12, 21.40, 50.68, 50.22, 48.68, 40.57, 29.60, 25.01, 21.46, 15.64, 8.44.

Strain: The same as for safracin D

Fermentation Conditions: The same batch as safracin D

Isolation: See safracin D conditions

Biological Activities of Safracin E

Antitumor Screening:

Primary screening of Safracin E (mol/L):

| Tissue | Cell line | GI50 | TGI | LC50 |
|---|---|---|---|---|
| Prostate | DU-145 | 8.34E−06 | 1.96E−05 | >1.96E−05 |
| Prostate | LN-caP | 3.86E−06 | 7.70E−06 | 1.54E−05 |
| Ovary | IGROV | 4.50E−06 | 8.85E−06 | 1.74E−05 |
| Ovary | IGROV-ET | 4.54E−06 | 8.25E−06 | 1.49E−05 |
| Breast | SK-BR3 | 5.05E−06 | 9.24E−06 | 1.70E−05 |
| Melanoma | SK-MEL-28 | 3.94E−06 | 6.93E−06 | 1.22E−05 |
| NSCL | A549 | 1.96E−05 | >1.96E−05 | >1.96E−05 |
| Leukemia | K-562 | 4.25E−06 | 8.21E−06 | 1.59E−05 |
| Pancreas | PANC1 | 6.05E−06 | 1.47E−05 | >1.96E−05 |
| Colon | HT29 | 7.89E−06 | 1.96E−05 | >1.96E−05 |
| Colon | LOVO | 7.15E−06 | >1.96E−05 | >1.96E−05 |
| Colon | LOVO-DOX | 5.07E−06 | 9.44E−06 | 1.75E−05 |
| Cervix | HELA | 4.15E−06 | 7.29E−06 | 1.28E−05 |
| Cervix | HELA-APL | 4.03E−06 | 7.25E−06 | 1.30E−05 |

Secondary Evaluation (Mol/L): secondary screening columns Macromolecules Synthesis (PROTEIN, DNA, RNA), Apoptosis (NUCLEOSOMES), DNA Binding (GEL). Safracin E IC50: 1.57E−05, >1.96E−05

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): 9.5 mm inhibition zone

Spectroscopic Data:

ESMS: m/z 511 [M+H]⁺; ¹H NMR (CDCl₃, 300 MHz): 6.51 (s, C-15), 4.04 (s, OMe), 3.75 (s, OMe), 2.23 (s, Me), 1.89 (s, Me), 0.84 (d, J=6.6 Hz); ¹³C NMR (CDCl₃, 75 MHz): 186.32, 181.28, 175.83, 156.43, 145.27, 142.75, 141.05, 137.00, 132.63, 128.67, 126.64, 122.00, 120.69, 60.69, 60.21, 59.12, 58.04, 57.89, 50.12, 49.20, 46.72, 39.88, 32.22, 25.33, 21.29, 15.44, 8.23.

Example 5 Cross-Feeding Experiments

Heterologous Expression of Safracin Biosynthetic Precursors Genes for P2 and P14 Production

In an attempt to shed light on the mechanism of P2 and P14 biosynthesis, we cloned and expressed the downstream NRPS genes to determine their biochemical activity.

To overproduce P14, the sacEFGH genes were cloned (pB7983) (FIG. 4). To overproduce P2 in a heterologous system, the sacD to sacH genes were cloned (pB5H83) (FIG. 4). For this purpose we PCR-amplified fragments harboring the genes of interest using oligonucleotides that contain an XbaI restriction site at the 5′ end. Oligonucleotides PFSC79 (5′-CGTCTAGACACCGGCTFFCATGG-3′ SEQ ID NO: 22) and PFSC83 (5′-GGTCTAGATAACAGCCAACAAACATA-3′ SEQ ID NO: 23) were used to amplify the sacE to sacH genes; and oligonucleotides 5HPTI-XB (5′-CATCTAGACCGGACTGATATTCG-3′ SEQ ID NO: 24) and PFSC83 (5′-GGTCTAGATAACAGCCAACAAACATA-3′ SEQ ID NO: 25) were used to amplify the sacD to sacH genes. The PCR fragments digested with XbaI were cloned into the XbaI restriction site of the pBBR1-MCS2 plasmid (Kovach et al., Gene 1994, 166, 175-176). The two plasmids, pB7983 and pB5H83, were introduced separately into three heterologous bacteria, P. fluorescens (CECT 378), P. putida (ATCC 12633) and P. stutzeri (ATCC 17588), by conjugation (see Table II). When the culture broth from fermentation of the transconjugant strains was analyzed by HPLC, large amounts of P14 were detected in the three strains containing the pB7983 plasmid, whereas large amounts of P2 and some P14 were observed when the pB5H83 plasmid was expressed in the heterologous bacteria.
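The primer design described above (an XbaI site near the 5′ end of each oligonucleotide) can be sanity-checked programmatically. The following is a minimal sketch, not part of the disclosed method: the helper name `check_primer` is hypothetical, the XbaI recognition sequence TCTAGA is standard knowledge rather than something stated here, and the sequences are copied exactly as printed in the text, including the characters "FF" in PFSC79, which are not valid nucleotide codes and are flagged rather than corrected.

```python
XBAI_SITE = "TCTAGA"  # XbaI recognition sequence (assumed from general knowledge)

# Primer sequences as printed in the text (5' to 3').
PRIMERS = {
    "PFSC79": "CGTCTAGACACCGGCTFFCATGG",  # 'F' is not a nucleotide code; kept as printed
    "PFSC83": "GGTCTAGATAACAGCCAACAAACATA",
    "5HPTI-XB": "CATCTAGACCGGACTGATATTCG",
}


def check_primer(seq):
    """Report primer length, 0-based offset of the XbaI site, and any non-ACGT characters."""
    return {
        "length": len(seq),
        "xbai_offset": seq.find(XBAI_SITE),  # -1 if the site is absent
        "invalid_chars": sorted(set(seq) - set("ACGT")),
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, check_primer(seq))
```

In this reading, each primer carries the XbaI site after a two-base 5′ overhang, and only PFSC79 contains characters that would need to be checked against the sequence listing (SEQ ID NO: 22).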

Cross-Feeding

As shown in Example 4, the sacF⁻ (PM-S1-008) and sacG⁻ (PM-S1-009) mutants were unable to produce safracins, P2 or P14. The addition of chemically synthesized P2 to these mutants during their fermentation yields safracin production.

Moreover, the co-cultivation of a heterologous strain of P. stutzeri (ATCC 17588) harboring plasmid pB5H83 (PM-18-004), whose expression produces P2 and P14, with either one of the two mutants sacF⁻ and sacG⁻ resulted in safracin production. The co-cultivation of a heterologous strain of P. stutzeri (ATCC 17588) harboring plasmid pB7983 (PM-18-005), whose expression produces only P14, with either one of the two P. fluorescens A2-2 mutants mentioned above resulted in no safracin production at all. These results suggest that P14 is transformed into P2, a molecule that can easily be transported in and out through the Pseudomonas sp. cell wall and whose presence is absolutely necessary for the biosynthesis of safracin.

Example 6 Biological Production of New “Unnatural” Molecules

The addition of 2 g/L of a specific modified P2 derivative precursor, P3 (3-hydroxy-5-methyl-O-ethyltyrosine), to the fermentation of the sacF⁻ mutant (PM-S1-008) yielded two “unnatural” safracins that incorporated the modified precursor P3 in their structure, Safracin A(OEt) and Safracin B(OEt).

Strain: sacF⁻ mutant from P. fluorescens A2-2 (PM-S1-008)

Fermentation Conditions:

Seed medium containing 1% glucose; 0.25% beef extract; 0.5% bacto-peptone; 0.25% NaCl; 0.8% CaCO₃ was inoculated with 0.1% of a frozen vegetative stock of the microorganism and incubated on a rotary shaker (250 rpm) at 27° C. After 30 h of incubation, a 2% (v/v) inoculum of the seed culture of the mutant PM-S1-008 was transferred into 2000 ml Erlenmeyer flasks containing 250 ml of the M-16 B production medium, composed of 15.2% mannitol; 3.5% dried brewer's yeast; 1.4% (NH₄)₂SO₄; 0.001% FeCl₃; 2.6% CaCO₃ and 0.2% P3 (3-hydroxy-5-methyl-O-ethyltyrosine). The incubation temperature was 27° C. from inoculation until 40 hours and then 24° C. to the end of the process (71 hours). The pH was not controlled. The rotary shaker was operated at 220 rpm with 5 cm eccentricity.
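For bench preparation, the percentages of the M-16 B production medium can be converted into amounts per 250 ml flask. The sketch below rests on assumptions that the source does not state: percentages of solids are read as weight per volume (g per 100 ml), the partly garbled salt entries are read as 1.4% ammonium sulphate and 0.001% FeCl₃, and all names in the code are illustrative. The 0.2% P3 entry is at least consistent with the 2 g/L of P3 mentioned at the start of this example.

```python
import math

FLASK_VOLUME_ML = 250.0     # production medium per 2000 ml Erlenmeyer flask
INOCULUM_PERCENT_VV = 2.0   # seed culture, % (v/v)

# Assumed reading of the medium composition, in % (w/v).
MEDIUM_PERCENT = {
    "mannitol": 15.2,
    "dried brewer's yeast": 3.5,
    "ammonium sulphate": 1.4,   # assumption: garbled entry read as (NH4)2SO4
    "FeCl3": 0.001,             # assumption: garbled entry read as 0.001% FeCl3
    "CaCO3": 2.6,
    "P3": 0.2,
}


def grams_for_volume(percent_wv, volume_ml):
    """Grams of solute for a given volume, with % (w/v) meaning g per 100 ml."""
    return percent_wv / 100.0 * volume_ml


def percent_to_g_per_l(percent_wv):
    """Convert % (w/v) to grams per litre."""
    return percent_wv * 10.0


def inoculum_ml(percent_vv, volume_ml):
    """Volume of seed culture for a % (v/v) inoculum."""
    return percent_vv / 100.0 * volume_ml


if __name__ == "__main__":
    for name, pct in MEDIUM_PERCENT.items():
        print(f"{name}: {grams_for_volume(pct, FLASK_VOLUME_ML):.4g} g per flask")
    print(f"seed culture: {inoculum_ml(INOCULUM_PERCENT_VV, FLASK_VOLUME_ML):.1f} ml per flask")
    # Consistency check against the 2 g/L P3 feed stated in the text.
    assert math.isclose(percent_to_g_per_l(MEDIUM_PERCENT["P3"]), 2.0)
```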

Isolation

The contents of four 2000 ml Erlenmeyer flasks (250 ml each) were pooled (970 ml) and centrifuged (12,000 rpm, 4° C., 10 min, J2-21 Centrifuge BECKMAN) to remove cells. The clarified broth (765 ml) was adjusted to pH 9.0 with 10% NaOH. The alkaline clarified broth was then extracted twice with EtOAc (1:1 v/v). The organic phase was evaporated under high vacuum to give a dark, greasy extract (302 mg).

This extract was triturated with hexane to remove impurities, and the solids were purified by column chromatography on normal-phase silica using a mixture of ethyl acetate:methanol (from 12:1 to 1:1). The fractions were analyzed under UV on TLC (Silica 60, mobile phase EtOAc:MeOH 5:4; Rf 0.3 for Safracin B-OEt and 0.15 for Safracin A-OEt). From this, safracin B OEt (25 mg) and safracin A OEt (20 mg) were obtained.

Biological Activities of Safracin B (OEt)

Antitumor Activities

Primary screening of Safracin B (OEt) (mol/L):

| Tissue | Cell line | GI50 | TGI | LC50 |
|---|---|---|---|---|
| Prostate | DU-145 | 4.01E−07 | 1.01E−06 | 1.60E−05 |
| Prostate | LN-caP | 4.84E−08 | >1.76E−05 | 8.28E−07 |
| Ovary | IGROV | 4.06E−08 | 9.97E−08 | 4.27E−06 |
| Ovary | IGROV-ET | 6.82E−07 | 1.19E−06 | 6.37E−06 |
| Breast | SK-BR3 | 4.82E−08 | 1.16E−07 | 1.02E−06 |
| Melanoma | SK-MEL-28 | 1.69E−07 | 4.40E−07 | 1.13E−06 |
| NSCL | A549 | 5.01E−07 | 1.16E−06 | 5.66E−06 |
| Leukemia | K-562 | 3.97E−08 | 1.08E−07 | 3.69E−06 |
| Pancreas | PANC1 | 6.49E−07 | 2.06E−06 | 1.35E−05 |
| Colon | HT29 | 2.44E−07 | 1.39E−06 | >1.76E−05 |
| Colon | LOVO | 4.43E−07 | 1.09E−06 | >1.76E−05 |
| Colon | LOVO-DOX | 2.09E−06 | 9.88E−06 | >1.76E−05 |
| Cervix | HELA | 8.92E−08 | 3.15E−07 | 1.35E−06 |
| Cervix | HELA-APL | 7.70E−08 | 2.74E−07 | 9.76E−07 |

Secondary evaluation of Safracin B (OEt) (mol/L):

| Secondary screening | Readout | IC50 |
|---|---|---|
| Macromolecules synthesis | PROTEIN | >1.76E−05 |
| Macromolecules synthesis | DNA | 1.76E−06 |
| Macromolecules synthesis | RNA | 1.76E−07 |
| Apoptosis | NUCLEOSOMES | 5.28E−08 |
| DNA binding | GEL | 1.76E−05 |

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): 17.5 mm inhibition zone

Spectroscopic Data:

ESMS: m/z 551 [M-H₂O+H]⁺; ¹H NMR (CDCl₃, 300 MHz): 6.48 (s, H-15), 2.31 (s, 16-Me), 2.22 (s, 12-NMe), 1.88 (s, 6-Me), 1.43 (t, J=6.9 Hz, Me-ethoxy), 1.35 (t, J=6.9 Hz, Me-ethoxy), 0.81 (d, J=7.2 Hz, H-26)

Strain: The same as for Safracin B (OEt)

Fermentation conditions: The same as for Safracin B (OEt)

Isolation:

The contents of four 2000 ml Erlenmeyer flasks (250 ml each) were pooled (970 ml) and centrifuged (12,000 rpm, 4° C., 10 min, J2-21 Centrifuge BECKMAN) to remove cells. The clarified broth (765 ml) was adjusted to pH 9.0 with 10% NaOH. The alkaline clarified broth was then extracted twice with EtOAc (1:1 v/v). The organic phase was evaporated under high vacuum to give a dark, greasy extract (302 mg).

This extract was triturated with hexane to remove impurities, and the solids were purified by column chromatography on normal-phase silica using a mixture of ethyl acetate:methanol (from 12:1 to 1:1). The fractions were analyzed under UV on TLC (Silica 60, mobile phase EtOAc:MeOH 5:4; Rf 0.3 for Safracin B-OEt and 0.15 for Safracin A-OEt). From this, safracin B OEt (25 mg) and safracin A OEt (20 mg) were obtained.

Biological Activities of Safracin A (OEt):

Antitumor Activities

Primary screening of Safracin A (OEt) (mol/L):

| Tissue | Cell line | GI50 | TGI | LC50 |
|---|---|---|---|---|
| Prostate | DU-145 | 2.64E−06 | 5.39E−06 | 1.10E−05 |
| Prostate | LN-caP | 3.78E−07 | 7.42E−07 | 1.45E−06 |
| Ovary | IGROV | 4.92E−07 | 9.28E−07 | 1.76E−06 |
| Ovary | IGROV-ET | 2.01E−06 | 5.10E−06 | 1.30E−05 |
| Breast | SK-BR3 | 5.55E−07 | 1.16E−06 | 5.57E−06 |
| Melanoma | SK-MEL-28 | 7.96E−07 | 1.90E−06 | 5.77E−06 |
| NSCL | A549 | 4.00E−06 | 7.17E−06 | 1.28E−05 |
| Leukemia | K-562 | 3.11E−07 | 6.86E−07 | 1.51E−06 |
| Pancreas | PANC1 | 3.06E−06 | 5.83E−06 | 1.11E−05 |
| Colon | HT29 | 1.97E−06 | 4.41E−06 | 9.88E−06 |
| Colon | LOVO | 2.03E−06 | 4.41E−06 | 9.61E−06 |
| Colon | LOVO-DOX | 5.72E−06 | 9.84E−06 | 1.69E−05 |
| Cervix | HELA | 1.02E−06 | 2.91E−06 | 7.85E−06 |
| Cervix | HELA-APL | 7.64E−07 | 2.32E−06 | 6.69E−06 |

Secondary Evaluation (Mol/L): secondary screening columns Macromolecules Synthesis (PROTEIN, DNA, RNA), Apoptosis (NUCLEOSOMES), DNA Binding (GEL). Safracin A (OEt) IC50: 6.33E−06, 1.81E−06

Antimicrobial activity: On solid medium, Bacillus subtilis, 10 μg/disk (6 mm diameter): 10 mm inhibition zone

Spectroscopic Data:

ESMS: m/z 553 [M+H]⁺; ¹H NMR (CDCl₃, 300 MHz): 6.48 (s, H-15), 2.33 (s, 16-Me), 2.21 (s, 12-NMe), 1.88 (s, 6-Me), 1.42 (t, J=6.9 Hz, Me-ethoxy), 1.34 (t, J=6.9 Hz, Me-ethoxy), 0.8 (d, J=6.9 Hz, H-26)
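The screening values reported for the compounds in this section mix plain numbers with right-censored entries such as ">1.96E−05" and use a typographic minus sign, so they need light parsing before they can be compared. The sketch below is one possible way to do this and is not part of the disclosure; the function names are hypothetical, and the only data used are the K-562 GI50 values copied from the screening results above.

```python
def parse_value(text):
    """Return (value, is_lower_bound) for entries such as '>1.96E−05'."""
    cleaned = text.strip().replace("\u2212", "-")  # typographic minus -> ASCII hyphen
    censored = cleaned.startswith(">")
    return float(cleaned.lstrip(">")), censored


# K-562 (leukemia) GI50 values in mol/L, as reported in the screening results above.
K562_GI50 = {
    "Safracin P-22A": "3.15E−06",
    "Safracin P-19B": "2.38E−06",
    "Safracin D": "6.04E−07",
    "Safracin E": "4.25E−06",
    "Safracin B (OEt)": "3.97E−08",
    "Safracin A (OEt)": "3.11E−07",
}


def rank_by_potency(table):
    """Sort compound names from lowest to highest GI50; censored values sort after equal exact ones."""
    return sorted(table, key=lambda name: parse_value(table[name]))


if __name__ == "__main__":
    for name in rank_by_potency(K562_GI50):
        value, censored = parse_value(K562_GI50[name])
        print(f"{name}: {'>' if censored else ''}{value:.2e} mol/L")
```

A lower GI50 means higher potency in this assay, so the first name in the ranking is the most active compound against K-562 among those listed. Censored entries are only lower bounds and should not be treated as measured values.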

Example 7 Enzymatic Transformation of Safracin B into Safracin A

In order to assay the enzymatic conversion of safracin B into safracin A, 120-hour fermentation cultures (see conditions in Example 2, Biological assay (biotest) for safracin production) of different strains were collected and centrifuged (9,000 rpm × 20 min). The strains assayed were P. fluorescens A2-2, as wild-type strain, and P. fluorescens CECT378+pBHPT3 (PM-19-006), as heterologous expression host. Supernatants were discarded, and the cells were washed twice (NaCl 0.9%) and resuspended in 60 ml of 100 mM phosphate buffer, pH 7.2. The cell suspension was distributed in 20 ml portions into three Erlenmeyer flasks:

A. Cell suspension + Safracin B (400 mg/L)
B. Cell suspension heated at 100° C. for 10 min + Safracin B (400 mg/L) (negative control)
C. Cell suspension without Safracin B (negative control)

The biochemical reaction was incubated at 27° C. at 220 rpm and samples were taken every 10 min. Transformation of safracin B into safracin A was followed by HPLC. The results clearly demonstrated that the gene cloned in pBHPT3, sacH, codes for a protein responsible for the transformation of safracin B into safracin A.

Based on these results, we carried out an assay to determine whether this same enzyme was able to recognize a different substrate such as ecteinascidin 743 (ET-743) and transform this compound into ET-745 (with the C-21 hydroxy missing). The experiment above was repeated to obtain Erlenmeyer flasks containing:

A. Cell suspension + ET-743 (567 mg/L approx.)
B. Cell suspension heated at 100° C. for 10 min + ET-743 (567 mg/L) (negative control)
C. Cell suspension without ET-743 (negative control)

The biochemical reaction was incubated at 27° C. at 220 rpm and samples were taken at 0, 10 min, 1 h, 2 h, 3 h, 4 h, 20 h, 40 h, 44 h and 48 h. Transformation of ET-743 into ET-745 was followed by HPLC. The results clearly demonstrated that the gene cloned in pBHPT3, sacH, codes for a protein responsible for the transformation of ET-743 into ET-745. This demonstrates that this enzyme recognizes ecteinascidin as a substrate and that it can be used in the biotransformation of a broad range of structures.

1. An isolated nucleic acid sequence comprising: a) the nucleic acid sequence of SEQ ID NO:1; b) the sacABCDEFGH operon of SEQ ID NO:1; c) the sacA, sacB, sacC, sacD, sacE, sacF, sacG, and sacH genes of SEQ ID NO:1; d) a nucleic acid sequence encoding the amino acid sequence SEQ ID NO: 2, 3, 4, 5, 6, 7, 8 or 9; or e) a nucleic acid sequence that is the full complement to the sequence in a), b), c), or d).
 2. The nucleic acid sequence of claim 1, wherein the nucleic acid sequence comprises: a) the nucleic acid sequence of SEQ ID NO:1; or b) the nucleic acid sequence which is the full complement to the sequence in a).
 3. A vector comprising the nucleic acid sequence of claim 1.
 4. The vector of claim 3 which is an expression vector.
 5. The vector of claim 3 which is a cosmid.
 6. A composition comprising at least one nucleic acid sequence of claim 1.
 7. The nucleic acid of claim 1 wherein the nucleic acid sequence comprises the sacABCDEFGH operon.
 8. An isolated nucleic acid sequence comprising both the sacABCDEFGH operon and the sacIJ operon of SEQ ID NO:1.
 9. The nucleic acid of claim 8 wherein the sacI gene of the sacIJ operon is disrupted.
 10. The nucleic acid of claim 8 wherein the sacJ gene of the sacIJ operon is disrupted.
 11. The nucleic acid of claim 8 wherein the sacI gene of the sacIJ operon is disrupted and the expression of the sacJ gene has been reconstituted.
 12. The nucleic acid of claim 8 wherein the sacF gene and/or the sacG gene of the sacABCDEFGH operon has been disrupted.
 13. The nucleic acid sequence of claim 1 wherein the nucleic acid sequence comprises SEQ ID NO:1.
 14. An isolated nucleic acid sequence comprising: a) the nucleic acid sequence of SEQ ID NO:1; b) the sacABCDEFGH operon of SEQ ID NO:1; or c) a nucleic acid sequence that is the full complement to the sequence in a) or b).